A few days ago, while researching text-to-video models, I noticed something that diverged significantly from my prior assumptions: in both text-to-image and text-to-video, Chinese companies have a much smaller presence in the open-source community than they do in language models.
I had assumed Chinese models were already leading across all fronts. But looking closer, that wasn't the case. This prompted me to review how the open-source community has changed over the past three years.
Model Open Source Is Not Software Open Source
First, something many people may not have considered.
With open-source software, once the source code is released, there are no secrets left—you can rebuild it from scratch. Models don't work that way. In most cases, open-sourcing a model means releasing the weights and inference scripts, while the core elements—training methods, datasets, and engineering details—remain undisclosed.
What can you do with the weights? Run inference and fine-tune. Reproduce the model from scratch? Nearly impossible. So "open source" for models was never the same thing as "open source" for software.
This led to a long-standing debate: since model open source differs from software open source, what's the point of open-sourcing models?
The greatest benefit of open-source software is community collaboration—developers worldwide fixing bugs and adding features. But after a model is open-sourced, very few people or institutions can actually participate in model R&D. You need massive compute, data, and training infrastructure. Most people can only run inference with the weights.
When Robin Li (Baidu's founder) said open-source large models were meaningless, I actually thought he had a point. If community collaboration—the biggest driver—doesn't apply, what's the point of open-sourcing?
Later events answered this question.
The Age of LLaMA
When ChatGPT emerged at the end of 2022, large models entered the public consciousness. Before that, the open-source world was rather quiet. OpenAI had fully open-sourced GPT-2 in 2019, but with GPT-3 it shifted to an apply-for-access API. I remember wanting to use GPT-3 at a hackathon and discovering I had to email for API access. It was essentially closed-source.
In 2023, Meta's LLaMA dominated the open-source community. LLaMA 1 arrived in February 2023, LLaMA 2 in July. Every time LLaMA released a new version, a wave of domestic Chinese models would announce upgrades. This was the so-called "hundred-model war," with everyone dancing to LLaMA's tune.
At this stage, Chinese models were open-sourced mainly for promotion, and the released versions were all relatively small. Zhipu's ChatGLM-6B was the earliest representative; many people's first exposure to local large-model deployment started with it. I remember asking a friend why he had chosen a Chinese model at the time; he said it came from Tsinghua and had a good reputation. Baichuan open-sourced a 14B model. In November 2023, 01.AI (founded by Kai-Fu Lee) open-sourced Yi-34B, which was relatively large for a Chinese open-source model at the time. The Shanghai Artificial Intelligence Laboratory also continued its InternLM (Shusheng) series.
Everyone followed the same strategy: open-source the small models for promotion, keep the large models closed for commercialization.
Qwen Enters the Game
In 2024, this equilibrium was shattered by Qwen.
Starting mid-2024, Alibaba's Qwen went all in, releasing open-source models ranging from a few billion to 72B parameters, all with strong performance. It had been assumed that large models wouldn't be open-sourced, and then suddenly someone released highly capable large models for free.
Although LLaMA had significant international influence, its Chinese capabilities were poor, requiring secondary training for practical use. Qwen worked almost out of the box for Chinese scenarios and quickly replaced LLaMA's position in the Chinese open-source community.
By the end of 2024, the Qwen series had become the de facto standard for Chinese open-source models. Closed-source models felt pressure from open-source models for the first time.
DeepSeek Flipped the Table
Qwen played by the existing rules perfectly. DeepSeek changed the rules entirely.
DeepSeek entered the scene in 2024 with a simple strategy: open-source everything from day one, and publish extremely thorough technical reports. At the end of 2024, DeepSeek V3 was released—hundreds of billions of parameters, top-tier performance, open-sourced immediately upon release. At that time, few had seen a model of that scale released openly.
But what truly exploded was R1 in January 2025.
OpenAI had launched its o1 reasoning model in September 2024, and DeepSeek's R1 came out just before the Spring Festival. Its reasoning capabilities were remarkably close to the top closed-source models of the time: not quite on par, but the gap was surprisingly small. And it was fully open-sourced on day one.
The previous order had been "small models open, large models closed." DeepSeek open-sourced something better than many companies' best closed-source models, and that order collapsed overnight.
LLaMA 4 provides another footnote. Meta spent a long time training a massive model to reclaim its position in the open-source community, launching it in April 2025, only to see it crash and burn. Performance fell far short of expectations, and benchmark-cheating allegations surfaced. Later, Yann LeCun himself admitted that "results were fudged", and Zuckerberg lost confidence in the entire GenAI team. LLaMA 4 was essentially unused; the LLaMA series' standing in the open-source community ended there.
Day-0 Open Source Became the Industry Norm
After DeepSeek, Chinese model companies shifted one after another to day-0 open source, releasing their best models on the same day they were announced.
Kimi open-sourced K2 in July 2025, a trillion-parameter MoE model. MiniMax open-sourced M2.5. Zhipu continued iterating the GLM series. Wave after wave, the quality of open-source models kept rising.
Today, if you look at international open-source language model leaderboards, they're almost entirely Chinese models. Qwen, DeepSeek, GLM, MiniMax, Kimi—the presence of overseas models has become very weak.
DeepSeek didn't just contribute a model. It changed how the entire industry plays its hand.
But the Winds Are Shifting Again
However, the day-0 open-source fever is cooling down.
DeepSeek's last open-source release was V3.2 in December 2025—over four months ago. V4 has been rumored for a long time but hasn't materialized. During this window, other companies' strategies have begun to waver.
Qwen 3.6 Plus was released at the end of March 2026 without being open-sourced, API-only. This was the first time a flagship Qwen model wasn't open-sourced. Zhipu's GLM-5.1 was also released closed-source first, though in the past few days Zhipu announced that its weights will be opened. Many companies' latest multimodal models are no longer open-sourced either.
It seems we're back to the old question: what's the point of open-sourcing? When competitive pressure decreases, the answer may change again.
Text-to-Image and Text-to-Video Are Still Waiting
Back to the surprising finding that started this whole reflection.
The text-to-image open-source community is still dominated by overseas models. The most widely used are Stability AI's Stable Diffusion series and Black Forest Labs' FLUX series. Chinese models have made some progress—Qwen released Qwen-Image, Tencent has Hunyuan Image 3.0, and Zhipu has GLM-Image. But compared to language models, the difference is stark.
The same goes for text-to-video. Alibaba's last open-source text-to-video model was Wan 2.2 in July 2025, nearly nine months ago, with nothing new since. The most talked-about open-source text-to-video model recently is LTX-2 from the Israeli company Lightricks, which open-sourced its weights in January 2026.
This is an entirely different world from language models. On the language model side, Chinese models have filled the entire open-source community; on the text-to-image and text-to-video side, it still looks more like 2023: overseas models dominate, with Chinese models appearing sporadically. The entire open-source community is experiencing a "short-video-style" explosion, but so far this explosion has mainly occurred in language models and software tools.
What Are We Waiting For
Everyone is waiting for DeepSeek V4.
But they're waiting for more than just a model. DeepSeek previously proved something: a sufficiently powerful model fully open-sourced on release day can shift the strategic direction of an entire industry. And from a business perspective, the endgame for model companies is becoming cloud companies—the business behind open-sourcing models is far larger than imagined. This happened in language models; it has yet to happen in text-to-image and text-to-video.
I sometimes half-jokingly think that maybe DeepSeek just needs to do it one more time. But then again, DeepSeek itself hasn't released a new model in over four months either.
