DeepSeek strikes again.
On January 1, 2026, DeepSeek published a new research paper: mHC (Manifold-Constrained Hyper-Connections). It's an improvement to the Transformer architecture that makes training more stable and cheaper.
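For context: Hyper-Connections (an earlier ByteDance proposal) widen the Transformer's single residual stream into several parallel streams mixed by learned weights, and my reading of mHC is that it constrains that mixing matrix to a well-behaved manifold (e.g., doubly stochastic matrices) so the streams can neither blow up nor collapse during training. Below is a minimal, hypothetical PyTorch sketch of that general idea, not the paper's actual implementation; the names `ManifoldHyperConnection`, `sinkhorn`, and `n_streams` are mine, and the sublayer is a stand-in.

```python
import torch
import torch.nn as nn

def sinkhorn(logits: torch.Tensor, n_iters: int = 5) -> torch.Tensor:
    """Project a square matrix of logits onto (approximately) the set of
    doubly stochastic matrices by alternating row/column normalization."""
    m = logits.exp()
    for _ in range(n_iters):
        m = m / m.sum(dim=-1, keepdim=True)  # normalize rows
        m = m / m.sum(dim=-2, keepdim=True)  # normalize columns
    return m

class ManifoldHyperConnection(nn.Module):
    """Illustrative sketch (not the paper's API): n parallel residual
    streams whose mixing matrix is constrained to be doubly stochastic,
    so repeated mixing cannot amplify or shrink the streams' total signal."""
    def __init__(self, dim: int, n_streams: int = 4):
        super().__init__()
        self.n = n_streams
        # Learnable logits for the stream-mixing matrix (the "hyper-connection").
        self.mix_logits = nn.Parameter(torch.zeros(n_streams, n_streams))
        # How the sublayer's output is written back into each stream.
        self.write = nn.Parameter(torch.full((n_streams,), 1.0 / n_streams))
        # Stand-in for the actual Transformer sublayer (attention or MLP).
        self.layer = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, dim))

    def forward(self, streams: torch.Tensor) -> torch.Tensor:
        # streams: (n_streams, batch, seq, dim). At the network input you
        # would replicate the hidden state into n streams; at the output,
        # average them back into one.
        mix = sinkhorn(self.mix_logits)            # doubly stochastic mixing
        mixed = torch.einsum("ij,jbsd->ibsd", mix, streams)
        x = mixed.mean(dim=0)                      # single input for the sublayer
        out = self.layer(x)
        # Distribute the sublayer output back across the streams.
        return mixed + self.write.view(-1, 1, 1, 1) * out
```

The intuition behind the stability claim: a doubly stochastic matrix is a weighted average along both rows and columns, so stacking many such mixings preserves the overall signal across streams instead of compounding it, which keeps deep training well-conditioned.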
A new flagship model is likely coming by mid-February.
Their playbook is consistent: first a technical paper, then the model. Liang Wenfeng, the founder, personally submitted this paper to arXiv (just as he did for R1 and V3).
As a reminder: one year ago, DeepSeek released R1 and sent Nvidia's stock plunging 17% in a single day (roughly $600B in market cap wiped out)!
DeepSeek continues to play a key role in the new AI paradigm:
- The gains from simple scaling are running out
- LLMs keep improving through architectural enhancements and reinforcement learning
- Costs are shifting from training toward inference, but the price of tokens is collapsing (divided by roughly 1,000 in 3 years; quick math below), which makes it possible to grow usage without costs exploding
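To put that price collapse in perspective: dividing by 1,000 over 3 years compounds to roughly a 10x drop every year.

```python
# A 1,000x price drop over 3 years compounds to ~10x per year,
# since 10 * 10 * 10 = 1,000.
annual_factor = 1000 ** (1 / 3)
print(f"~{annual_factor:.1f}x cheaper per year")  # ~10.0x
```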
2026 is shaping up to be just as intense as 2025.