Model Companies' Endgame Is Becoming Cloud Companies

How do model companies make money with open source? Looking at DeepSeek's numbers makes it clear: model companies and cloud companies are becoming two sides of the same coin, and open source is currently the most efficient customer acquisition method.

People often ask: How do open-source models make money?

Reframe the question and it becomes clearer: how does open-source software make money? Hosted cloud services. Redis is open source; Redis Cloud makes money. MongoDB is open source; Atlas makes money. Models fit this path even better than traditional open-source software does.

The previous post discussed how the open-source community has changed over the past three years; this one talks about money—how the business behind open-source models actually works.

Two Sides of the Same Coin

Look at the current global landscape: cloud providers are desperately building models, while model companies are desperately buying compute.

Google's 2025 capital expenditure exceeds $90 billion, centered around self-developed Gemini models and self-developed TPUs. Microsoft, partnered with OpenAI, has poured $80 billion into building AI data centers. Amazon has invested over $100 billion to expand AWS compute, while also investing $4 billion in Anthropic. The three companies' combined capital expenditure in 2025 alone exceeds $300 billion, with the vast majority going to AI.

Now look at the model companies. OpenAI's 2025 revenue exceeds $20 billion, primarily from APIs and subscriptions—essentially selling inference compute. Anthropic signed a usage agreement with Google Cloud for millions of TPUs worth tens of billions of dollars, while simultaneously running over 500,000 Trainium chips on AWS. Tell me, is this a model company or a cloud company?

Both sides are becoming the same thing.

Running the Numbers on DeepSeek

DeepSeek demonstrated this perfectly.

On January 20, 2025, R1 was released during the Spring Festival holiday. Six days later, the app hit #1 on the US iOS download charts, simultaneously topping the charts in 52 countries. January saw over 14 million downloads, with monthly active users approaching 100 million by April. No ad spend, no marketing campaigns—customer acquisition cost was essentially zero.

The API pricing was aggressive too. R1 is priced at $0.55 per million input tokens; OpenAI's comparable o1 is $15. That's roughly 3.5% of OpenAI's price, far more aggressive than the "one-fifth of OpenAI's price" people were talking about. Many said this was selling at a loss for publicity, impossible to make money.

At the end of February, DeepSeek held an "Open Source Week," releasing five underlying optimization technologies over five days: FlashMLA, DeepEP, DeepGEMM, DualPipe, and 3FS—from attention decoding to matrix operations to pipeline parallelism to distributed file systems, all infrastructure-level components built in-house. DeepGEMM's core code is only 300 lines, yet outperforms expert-tuned kernels. Only then did people realize how much work this company had done at the infrastructure level.

Then on March 1st, DeepSeek released a set of data: calculated at H800 rental prices of $2/hour, the daily inference GPU cost for the V3 and R1 models was approximately $87,000. If all traffic that day were billed at R1's pricing, theoretical daily revenue would be approximately $562,000. The theoretical cost-profit ratio: 545%.
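As a sanity check, the ratio follows directly from the two disclosed daily figures (rounded here, so the result lands a hair off the exact 545% DeepSeek reported):

```python
# Figures from DeepSeek's March 1, 2025 disclosure (inference GPUs only).
daily_gpu_cost = 87_000              # USD: H800 rentals at $2/hour
theoretical_daily_revenue = 562_000  # USD: all traffic billed at R1 pricing

profit = theoretical_daily_revenue - daily_gpu_cost
cost_profit_ratio = profit / daily_gpu_cost

# Close to the disclosed 545%; the small gap comes from the rounded inputs.
print(f"{cost_profit_ratio:.0%}")
```

Note this is profit over cost, not a margin over revenue, which is why the figure can exceed 100%.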

Of course, this 545% needs discounting, as DeepSeek themselves noted: the web interface and app are both free, V3 is priced lower than R1, and there are off-peak discounts, so actual monetizable traffic is far less than total traffic. The number also counts only inference GPU rental fees, excluding training costs, R&D investment, and personnel expenses. Industry estimates put V3's actual total R&D cost between $500 million and $1.6 billion.

But the 545% itself isn't the point. The point is: with the same open-source model, if others run inference services at this pricing, they'll likely lose money. Because DeepSeek has done extensive optimization at the infrastructure level, at the same price, they make money. Pricing power lies with the original manufacturer.

The Flywheel Spins

What's the biggest headache in the cloud business? Everyone sells roughly the same thing, bare metal with a thin layer of services on top, and gross margins get squeezed quickly. AWS's operating margin is roughly 33% to 38%, already the ceiling. Google Cloud went from years of losses to around 30%. Smaller cloud providers have even thinner margins and highly concentrated customers; when big clients squeeze you on price, you have no leverage. No matter how much you invest in underlying technology, it's hard to translate that into customer-perceived differentiation.

Add a model layer and things change. Suppose I operate a large-scale inference cluster: if I improve model efficiency by 10%, the same hardware produces 10% more tokens. Take that extra profit and invest it in R&D, further optimizing inference efficiency and driving down unit costs; then you can undercut the market on price, attract more users, drive up compute utilization, and profits increase again. Then invest more in R&D.
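The compounding shape of that loop can be sketched with a toy model. Every number below is hypothetical, chosen only to show the direction: each cycle, R&D cuts cost per token, part of the savings is passed on as a price cut, and the lower price grows volume.

```python
# Toy flywheel simulation: all rates are hypothetical, illustrative only.
cost_per_mtok = 1.00   # cost to serve one million tokens
price_per_mtok = 1.20  # price charged per million tokens
volume_mtok = 100.0    # monthly volume, in millions of tokens

for cycle in range(4):
    margin = (price_per_mtok - cost_per_mtok) * volume_mtok
    print(f"cycle {cycle}: margin=${margin:.1f}")
    cost_per_mtok *= 0.90    # assume R&D yields a 10% cost cut per cycle
    price_per_mtok *= 0.95   # pass half the gain on as a price cut
    volume_mtok *= 1.15      # assume the lower price grows demand 15%
```

Even though the price falls every cycle, the margin grows, because cost falls faster than price while volume expands. That is the cycle the old resource-selling cloud never had.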

The old cloud business didn't have this cycle. You spent a lot of money on technical improvements, but customers couldn't perceive them. Models are different—inference efficiency optimizations directly translate to money: either lower costs, or more output at the same cost.

In 2025, Google increased capital expenditure from $75 billion to $93 billion, most of it going to AI infrastructure. What they're seeing is this shift: the model layer gives the cloud business real technical leverage.

Open Source Is the Most Efficient Customer Acquisition

Why not just sell closed-source? Because you can't tell how good a model is just by looking at benchmarks.

Llama 4 is a cautionary tale. In April 2025, Meta released Llama 4 Maverick, submitting it to LMArena where it ranked #2. It was quickly discovered that the version submitted to the leaderboard was a specially tuned "experimental version"—responses were exceptionally long, full of emojis, with fancy formatting—all tricks to game the scores. When the publicly released standard version was retested, it ranked #32. Later, when Yann LeCun left Meta, he personally admitted that "results were fudged." Zuckerberg lost confidence in the entire GenAI team, and the LLaMA series essentially exited the open-source community stage.

Benchmarks can be gamed; user experience cannot.

When a model is open-sourced, everyone can run it and test it. You know within minutes whether it's any good. This process builds word-of-mouth and creates stickiness. When I was helping friends set up AI client tools, someone pulled out a DeepSeek API account they had registered six months ago to connect. In that scenario, DeepSeek wasn't actually the optimal choice, but it felt convenient—they had registered, used it, and already had trust. Developers are similar: if they built a project using a particular model's API before, they'll probably use it for the next project too. Switching has costs; rebuilding trust costs even more.

DeepSeek's strategy is to combine open-source models with a free app. Developers test the open-source models; regular users test the app—building awareness on both fronts simultaneously. Some naturally convert to paid API users. Customer acquisition cost approaches zero, with broader reach than spending hundreds of millions on marketing.

Not All Models Fit This Path

This logic works smoothly for language models. DeepSeek R1 has hundreds of billions of parameters—you can't run it without a GPU cluster. If you want to use it, there are two paths: the free app or the paid API. Either way, the traffic is on their cloud. The capability gap between large and small models is significant, so users naturally gravitate toward the cloud.

Text-to-image is different. Open-source models like Stable Diffusion and FLUX can run on a single gaming GPU. The barrier is so low that individual users can deploy them at home. If the gap between large and small models isn't that significant, the market fragments—large numbers of users choose to run locally, and cloud demand isn't as concentrated.
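A back-of-the-envelope memory estimate makes the gap concrete. The parameter counts below are published specs (DeepSeek R1 at 671B parameters with FP8 weights, Stable Diffusion XL at roughly 3.5B with FP16 weights); the calculation covers weights only and ignores activations and KV cache, so real requirements are higher:

```python
def weights_gb(params_billion: float, bytes_per_param: int) -> float:
    """Approximate GB needed to hold model weights alone:
    (params × 1e9) × bytes per parameter ≈ params_billion × bytes, in GB."""
    return params_billion * bytes_per_param

# DeepSeek R1: 671B params, FP8 (1 byte) -> ~671 GB, needs a GPU cluster.
print(weights_gb(671, 1))

# Stable Diffusion XL: ~3.5B params, FP16 (2 bytes) -> ~7 GB, fits one gaming GPU.
print(weights_gb(3.5, 2))
```

Two orders of magnitude in weight memory is the difference between "everyone must come to my cloud" and "hobbyists run it at home."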

Text-to-image and text-to-video have another push factor: involving image and video content, they naturally face more moderation and regulatory constraints. Cloud services must implement content filtering, but these constraints largely don't exist when running locally. This also pushes some users toward local deployment.

So whether this open-source business logic works depends on two things: whether the capability gap between large and small models is large enough, and whether the barrier to local deployment is high enough. Language models currently satisfy both conditions. Text-to-image falls short on both. Text-to-video is still evolving, so it's hard to say.

Better Cloud

Either way, commercializing models means tying them to cloud services. I think this is actually a good thing.

The old cloud was just selling resources and competing on price; no matter how much you spent on technology, it was hard to differentiate. With models added to the mix, things change: technical investment can directly reduce inference costs and create pricing headroom. Companies that do this well can actually make good money.

The endgame for model companies is probably becoming cloud companies—not the kind that sells bare metal, but a new type of AI cloud that sells intelligence.
