Aug~Nov AI Research Trend Report
Basically recapping what I missed in the last 4 months
Some of you might know that I was gone for four months (August to November) due to mandatory military service.
For those of you who don't, well, that's why I am properly recapping the papers I missed over those four months here.
This list covers the papers I found to have fascinating ideas, to be potentially pivotal, or to discuss or propose critical updates to the current AI research landscape.
The focus is, as usual, on LLM papers, but I also include some non-LLM papers that are pretty big in their own fields.
What the SoTA Is Doing
The Hybrid Attention Saga
In the four months I was gone, the Chinese open-source community had a pretty active discussion around hybrid attention. I think this tweet sums it up best.
So what was the timeline? And can we figure out what’s actually going on?
The trajectory of attention mechanisms while I was gone included a period of quickly shifting optimism about hybrid attention. Thanks to the work being open source, the discussion was shared publicly through papers, blogs, and even X. For context, hybrid architectures are usually a combination of linear attention and standard attention. For the sake of simplicity, I'll only be covering the blogs and papers; the discussion on X was all over the place, without much empirical evidence to base it on.
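To make the "combination of linear attention and standard attention" concrete, here is a minimal sketch of the common interleaving pattern: mostly linear-attention layers, with a full softmax-attention layer inserted periodically. The feature map `phi`, the `full_every` ratio, and the single-head residual stack are illustrative assumptions, not any specific model's architecture.

```python
import numpy as np

def softmax_attention(q, k, v):
    # Standard attention: builds the full O(n^2) score matrix over token pairs.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

def linear_attention(q, k, v, phi=lambda x: np.maximum(x, 0.0) + 1.0):
    # Linear attention: a positive feature map phi replaces the softmax, so
    # the (key, value) summary is built once in O(n) and reused per query.
    kv = phi(k).T @ v                      # (d, d) running summary
    z = phi(k).sum(axis=0)                 # (d,) normalizer
    return (phi(q) @ kv) / (phi(q) @ z)[:, None]

def hybrid_stack(x, n_layers=4, full_every=4):
    # Hybrid pattern: linear-attention layers, with standard attention
    # every `full_every`-th layer, joined by residual connections.
    for i in range(n_layers):
        attn = softmax_attention if (i + 1) % full_every == 0 else linear_attention
        x = x + attn(x, x, x)
    return x
```

Real hybrid models vary the ratio (e.g. one full-attention layer per several linear ones) and often use different linear-attention formulations, but the trade-off is the same: the linear layers keep per-token cost constant, while the occasional full layers restore exact pairwise retrieval.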
