Archive | The AI Timeline

The AI Timeline

Archive

Long Context Pre-Training w/ Lighthouse Attention

plus more about Self-distilled Agentic RL, Embedded Language Flows, and Negation Neglect

by cloud

Think In Diffusion: Continuous Latent Diffusion Language Model

plus more on Sparser, Faster, Lighter Transformer LMs, Manifold Steering, and Teaching Claude Why

by cloud

DeepSeek's Deleted Paper: Thinking With Visual Primitives

can't believe they removed this paper unknowingly

by cloud

There Will Be a Scientific Theory of Deep Learning

plus more about Hyperloop Transformer, Qwen-3.5 Omni, and Scaling Self-Play with Self-Guidance

Kimi Moonshot: Prefill-as-a-Service!?

plus more about Looped Transformers, Nexus, RNN with Memory, and more

by cloud

Neural Computer: Running an OS within an AI?!

plus more about In-Place TTT, TriAttention, and Interleaved Head Attention

by cloud

Embarrassingly Simple Self-Distillation Technique

plus more on Path-Constrained MoE, HISA, and Screening is not enough

by cloud

weekly papers recap

LeWorldModel: JEPA but more practical

plus more on Claudini, Composer 2, and self-distillation

by cloud

Rotate attention by 90 degrees...? Kimi's New Attention Residuals

plus more about V-JEPA 2.1, Mamba 3, and latent planning

by cloud

You can train OpenClaw just by talking to it?

and more about GLM-OCR, pre-pre-training on NCA, IndexCache, and neural thickets

by cloud

Flash Attention 4 is nuts

and more about Speculative Speculative Decoding, SWE-CI, and Beyond Language Modeling

by cloud

Compress Context... Into a LoRA!?

plus more on Learning Without Training and The Geometry of Noise

by cloud
