We might be entering the era of diffusion LLMs! Google's new model hints at it

Google's new model, "Cheetah," generates about 400 tokens per second. It also restructured messy but functional code into a cleanly organized version, a task GPT-5 High failed at repeatedly. Taken together, the speed and the refactoring ability suggest it is likely a diffusion model rather than a conventional autoregressive transformer.
