If big companies can't make a net return on their LLM token costs, that doesn't mean it's impossible to. In fact this is exactly what you'd expect to happen with a new technology. Incumbents can't use it well, and are replaced by upstarts who can.
Y Combinator co-founder Paul Graham argues big corporations' failure to profit from LLM tokens is a predictable adoption phase
Story Overview
Paul Graham's post frames large companies' difficulty turning LLM token spend into net profit as the usual early friction with any new technology, where established players integrate it awkwardly and newer ones move faster on practical execution. The note aligns with his long-running view that incumbents often lose ground during these transitions rather than capture the upside. Garry Tan's amplification highlights internal skill shortfalls at big firms as the concrete mechanism creating space for startups.
Past tech shifts show the same early mismatch
The argument rests on observed patterns rather than fresh metrics from current LLM deployments, leaving open how long the unprofitable phase typically lasts before adaptation catches up or displacement occurs. Without company-specific cost or return data attached to the post, readers are left weighing whether this cycle will follow prior timelines or diverge because of the scale of token infrastructure already in place.
Execution edges now hinge on closing internal gaps
Tan points to organizational skill shortfalls as the practical barrier, which implies the advantage may sit less with raw model access and more with teams that can embed the tools into existing workflows without layers of coordination overhead. The absence of named examples or measured gap sizes keeps the claim at the level of pattern recognition rather than a testable prediction.
Users are reacting to claims that startups will profit from LLMs where big tech struggles, with many excited by productivity gains and cost efficiencies while others dismiss the ideas as unrealistic or the business models as flawed.
Most Activity
Curiously enough I did office hours today with a startup that cuts companies' LLM token costs by optimizing requests. They can cut costs by about half, which they split with the customer. So the TAM is a quarter of the model companies' corporate revenue. That's a big TAM!
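The arithmetic behind that TAM estimate can be sketched in a few lines. This is only an illustration of the reasoning in the post: the function name, the parameter names, and the example spend figure are hypothetical, and the two 50% rates are the round numbers quoted above, not measured data.

```python
def optimizer_revenue(token_spend: float,
                      savings_rate: float = 0.5,
                      vendor_share: float = 0.5) -> float:
    """Revenue an optimizer keeps from one customer's token spend.

    savings_rate: fraction of the token bill the optimizer removes (assumed 0.5).
    vendor_share: fraction of those savings the optimizer keeps (assumed 0.5).
    """
    return token_spend * savings_rate * vendor_share


# Hypothetical customer spending 1,000,000 (any currency) on tokens per year.
spend = 1_000_000.0
revenue = optimizer_revenue(spend)
print(revenue / spend)  # 0.25, i.e. a quarter of the original token spend
```

Note that the quarter is measured against the pre-optimization bill; once the savings are applied, the model companies' own revenue from that customer is halved.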
Skill issues at big company means small new ones can eat their lunch

@paulg Or just do this :)

@paulg I also think the endgame of LLM routing looks like a mixture of 80% fine-tuned local models / 20% frontier lab models

@paulg "The very processes and values that constitute an organization’s capabilities in one context, define its disabilities in another context."
-Clayton Christensen

@paulg We are building this with an open source core engine @modelmeld. Approximating TAM as splitting savings with customers is the wrong way to look at it IMO, because it is very hard to validate what costs "would have been". https://github.com/modelmeld/modelmeld

@rickasaurus Their valuations are bets on the probability of this outcome.

@paulg Adapt or die. A tale as old as time.

Um, yeah. I’m not sharing details on my defense. I have haters. When I became vocal about the comparison between how some companies code and how sophisticated hackers exploit, it opened me up to revenge attempts.
I only ever messed with blackhats in places like CryptBB, and they get extra pissed when you fry their system or expose how fragile their setup really is. These Kali kids aren’t used to systems-level exploitation. They’ll go all the way around their ass just to get into Google. They’re tool users, not systems thinkers. That’s the issue with tech now. One generation had to learn things the hard way, which was better for developing real understanding. You had to break things, trace things, rebuild things, and actually understand the system. Then the next generation grows up inside polished products and prebuilt tools. They become productized employees: trained to operate the interface, not to understand the machine underneath it, and that’s how craft gets replaced by workflow.
I haven’t hacked in years except cod cheaters. Little bastards. Change subject.
Tell me more on quantum lab at vandy. I’m not far

@paulg The false premise is token costs won't approach software costs.
For some reason, every time you ask ChatGPT to solve a Rubik's Cube, it regenerates the same code over 8 minutes. Everyone is very wasteful right now.
We invented a new primitive that reduces this cost to 0.

@paulg The fatal mistake is expecting cost reductions in IT. That line item is only going to get bigger as % net sales. All token ROI needs to be in COGS and classically stubborn operating lines, such as legal and leases.

People should be assigned a threshold of tokens, then they will start making better decisions.
It's easy to get lazy with an LLM and ask it silly things like "make a screenshot of all views of the app, open them and let me know what you think" instead of just looking at the app yourself.
The best employees will be the ones that bring better results with less token usage. It should definitely be a metric.
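A per-person token threshold like the one proposed here could be tracked with something as small as the sketch below. It is a hypothetical illustration, not any vendor's API: the `TokenBudget` class, its method names, and the 10,000-token limit are all assumptions made for the example.

```python
class TokenBudget:
    """Tracks token usage for one person against a fixed limit (hypothetical)."""

    def __init__(self, limit: int) -> None:
        if limit < 0:
            raise ValueError("limit must be non-negative")
        self.limit = limit
        self.used = 0

    @property
    def remaining(self) -> int:
        return self.limit - self.used

    def charge(self, tokens: int) -> bool:
        """Record a request; return False and record nothing if it would exceed the limit."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if tokens > self.remaining:
            return False
        self.used += tokens
        return True


budget = TokenBudget(limit=10_000)      # assumed limit for illustration
first_ok = budget.charge(6_000)         # fits within the budget
second_ok = budget.charge(5_000)        # would exceed the budget, so it is refused
print(first_ok, second_ok, budget.remaining)  # True False 4000
```

Dividing some measure of delivered results by `budget.used` would give the kind of efficiency metric the reply suggests, though how to measure "results" is left open here.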

Hi 👋 Paul, keep all saved 99% tokens from below ⬇️
As someone who builds AI agents every day, token usage quickly turned into a major bottleneck for me
It’s now saving me ~88 million tokens per day (and climbing toward 2B+ monthly)
I’ve fully open-sourced it under MIT so everyone can benefit
Would love your feedback if you give it a try! 🙏

@paulg That is brilliant!

@paulg Half the token bill sounds nice until you realize the real win is getting people to trust a third party with their prompts. Splitting the savings is clever, but the margin’s razor thin.

@paulg Has any dominant incumbent in one era managed to retain dominance in another?

@XTeamPal how does it work, bro? Can I use it in my Claude Code or other agents?

@paulg The 50% cut is real, but the bigger lever is upstream. Most enterprise LLM spend goes to requests that should never reach the model. Fix the routing, add caching, decompose the tasks, and costs drop before you optimise a single token.
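The "upstream" idea in this reply, keeping requests from reaching the model at all, can be sketched as a small gate that first tries a cheap local handler, then a cache, and only then calls the model. Everything named here (`RequestGate`, `fake_model`, `answer_locally`) is hypothetical and stands in for whatever client and rules a real deployment would use; task decomposition is not shown.

```python
import hashlib
from typing import Callable, Dict, Optional


class RequestGate:
    """Routes a prompt to a local handler, a cache, or the model, in that order."""

    def __init__(self,
                 call_model: Callable[[str], str],
                 local_handler: Callable[[str], Optional[str]]) -> None:
        self._call_model = call_model
        self._local_handler = local_handler
        self._cache: Dict[str, str] = {}
        self.model_calls = 0  # how many requests actually reached the model

    @staticmethod
    def _key(prompt: str) -> str:
        # Normalize case and whitespace so trivially different prompts share a key.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def ask(self, prompt: str) -> str:
        local = self._local_handler(prompt)
        if local is not None:          # routing: answered without the model
            return local
        key = self._key(prompt)
        if key in self._cache:         # caching: repeated request
            return self._cache[key]
        self.model_calls += 1
        answer = self._call_model(prompt)
        self._cache[key] = answer
        return answer


def fake_model(prompt: str) -> str:
    """Stand-in for a paid model call."""
    return f"model answer to: {prompt}"


def answer_locally(prompt: str) -> Optional[str]:
    """Toy rule: health checks never need a model."""
    return "pong" if prompt.strip().lower() == "ping" else None


gate = RequestGate(fake_model, answer_locally)
gate.ask("ping")
gate.ask("Summarize the Q3 report")
gate.ask("summarize  the Q3 report")   # same request after normalization
print(gate.model_calls)  # 1
```

Exact-match caching on a normalized prompt is the simplest possible choice; it only helps when requests repeat nearly verbatim, which is an assumption about the workload.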

@paulg Alternatively, today’s foundation models are replaced by lower cost architectures.