Aug~Nov AI Research Trend Report

Basically recapping what I missed in the last 4 months


Some of you might know that I was gone for four months (August to November) due to mandatory military service.

For those of you who don’t, well, that’s why I am properly recapping here all the papers I missed in the last four months.

This list covers papers I found to have fascinating ideas, to be potentially pivotal, or to discuss or propose critical updates to the current AI research landscape.

The focus is, as usual, on LLM papers, but I also include some non-LLM papers that are pretty big in their own fields.

What the SoTA Is Doing

The Hybrid Attention Saga

In the four months I was gone, the Chinese open source community had a pretty active discussion surrounding hybrid attention. I think this tweet sums it up the best.

So what was the timeline? And can we figure out what’s actually going on?

The trajectory of attention mechanisms while I was gone included a period of rapidly shifting optimism around hybrid attention. Because the work is open source, the discussion played out publicly through papers, blogs, and even X. For context, hybrid architectures usually combine linear attention with standard attention. For the sake of simplicity, I’ll only cover the blogs and papers; the discussion on X was all over the place, without much empirical evidence to back it up.
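To make the "combination of linear and standard attention" concrete, here is a minimal, non-causal sketch in numpy. The layer schedule (`full_every`), the feature map `phi`, and the shapes are all my own illustrative assumptions, not any specific model's design; real hybrid models differ in which layers are linear and how the linear attention is formulated.

```python
import numpy as np

def softmax_attention(q, k, v):
    # Standard attention: O(n^2) in sequence length n.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def linear_attention(q, k, v, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
    # Linear attention: a positive feature map phi lets us reorder the
    # computation as phi(q) @ (phi(k).T @ v), which is O(n) in n.
    qp, kp = phi(q), phi(k)
    kv = kp.T @ v                    # (d, d_v) summary, independent of n
    z = qp @ kp.sum(axis=0)          # per-token normalizer, shape (n,)
    return (qp @ kv) / z[:, None]

def hybrid_stack(x, n_layers=8, full_every=4):
    # Hypothetical hybrid schedule: mostly linear-attention layers, with
    # one standard-attention layer every `full_every` layers.
    for i in range(n_layers):
        attn = softmax_attention if (i + 1) % full_every == 0 else linear_attention
        x = x + attn(x, x, x)        # residual connection
    return x

x = np.random.default_rng(0).normal(size=(16, 32))  # (seq_len, dim)
y = hybrid_stack(x)
```

The appeal is the asymptotics: the linear layers carry most of the depth at O(n) cost, while the occasional standard-attention layer preserves exact pairwise token interactions.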
