OpenAI announces 80% price drop for o3, its most powerful reasoning model

Good news, AI developers!
OpenAI has announced a substantial price cut on o3, its flagship reasoning large language model (LLM), slashing costs by a whopping 80% for both input and output tokens.
(Recall that tokens are the numeric units LLMs use to represent words, word fragments, code, math, and other content; they are representations of the semantic constructions the model learned during training and, in essence, the LLM's native language. Most LLM providers offer their models through application programming interfaces, or APIs, that developers can build apps on top of or plug their external apps into, and most charge for the privilege at a rate per million tokens.)
The update positions the model as a more accessible option for developers seeking advanced reasoning capabilities, and places OpenAI in more direct pricing competition with rival models such as Gemini 2.5 Pro from Google DeepMind, Claude Opus 4 from Anthropic, and DeepSeek’s reasoning suite.
Announced by Altman himself on X
Sam Altman, CEO of OpenAI, confirmed the change in a post on X, noting that the new pricing is intended to encourage broader experimentation: “we dropped the price of o3 by 80%!! excited to see what people will do with it now. think you’ll also be happy with o3-pro pricing for the performance :)”
The cost of using o3 is now $2 per million input tokens and $8 per million output tokens, with cached input tokens, i.e. portions of a prompt identical to what the user submitted before and stored by the service, billed at a discounted $0.50 per million.
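At these rates, estimating the bill for a given workload is simple arithmetic. A minimal sketch in Python, using the new o3 prices reported above (the token counts in the example are hypothetical):

```python
# Cost estimate at the new o3 rates reported above (USD per million tokens).
INPUT_RATE = 2.00          # uncached input tokens
CACHED_INPUT_RATE = 0.50   # cached input tokens
OUTPUT_RATE = 8.00         # output tokens

def o3_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Return the estimated cost in USD for one request."""
    uncached = input_tokens - cached_tokens
    return (uncached * INPUT_RATE
            + cached_tokens * CACHED_INPUT_RATE
            + output_tokens * OUTPUT_RATE) / 1_000_000

# Example: a 50,000-token prompt (10,000 of it cached) and a 5,000-token reply.
print(f"${o3_cost(50_000, 5_000, cached_tokens=10_000):.4f}")  # → $0.1250
```

Under the previous $10/$40 rates, the same request would have cost roughly five times as much.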
This marks a significant reduction from the previous rates of $10 (input) and $40 (output), as OpenAI researcher Noam Brown pointed out on X.
Ray Fernando, a developer and early adopter, celebrated the price drop in a post, writing “LFG!”, short for “let’s fucking go!”
The sentiment reflects a growing enthusiasm among builders looking to scale their projects without prohibitive model access costs.
Price comparison to other rival reasoning LLMs
The price adjustment comes at a time when AI providers are competing more aggressively on both performance and affordability. A comparison with other leading AI reasoning models illustrates how significant this move could be:
Gemini 2.5 Pro Preview, developed by Google DeepMind, charges between $1.25 and $2.50 for input depending on prompt size, and $10 to $15 for output. While its integration with Google Search offers additional functionality, that service carries its own cost — free for the first 1,500 requests per day, then $35 per thousand requests.
Claude Opus 4, marketed by Anthropic as a model optimized for complex tasks, is the most expensive of the group, charging $15 per million input tokens and $75 for output. Prompt caching read and write services come at $1.50 and $18.75 respectively, although users can unlock a 50% discount with batch processing.
DeepSeek’s models, particularly DeepSeek-Reasoner and DeepSeek-Chat, undercut much of the market with aggressive low pricing. Input tokens range from $0.07 to $0.55 depending on caching and time of day, while output ranges from $1.10 to $2.19. Discounted rates during off-peak hours bring prices down even further, to as low as $0.035 for cached inputs.
In addition, independent third-party AI model comparison and research group Artificial Analysis ran the new o3 through its suite of benchmarking tests on various tasks, and found it cost $390 to complete them all, versus $971 for Gemini 2.5 Pro and $342 for Claude 4 Sonnet.
Narrowing the cost vs. intelligence gap for developers
OpenAI’s pricing move not only narrows the gap with ultra-low-cost models like DeepSeek but also puts downward pressure on higher-priced offerings like Claude Opus and Gemini Pro.
Unlike Claude or Gemini, OpenAI’s o3 also offers a flex processing mode, priced at $5 per million input tokens and $20 per million output tokens, giving developers more control over compute cost and latency depending on workload type.
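In OpenAI's API, flex processing is selected with a `service_tier` request parameter (per OpenAI's documentation at the time of writing; treat the exact field as an assumption to verify against current docs). The sketch below only builds the request payload rather than sending it, so no API key is needed:

```python
# Sketch: selecting flex processing via the `service_tier` field on an
# o3 chat request payload. The payload is constructed but not sent.
import json

def build_o3_request(prompt: str, flex: bool = False) -> dict:
    """Build a chat-style request payload for o3, optionally on the flex tier."""
    payload = {
        "model": "o3",
        "messages": [{"role": "user", "content": prompt}],
    }
    if flex:
        payload["service_tier"] = "flex"  # billed at the flex rates above
    return payload

print(json.dumps(build_o3_request("Summarize this contract.", flex=True), indent=2))
```

A developer might route latency-tolerant batch jobs through the flex tier while keeping interactive traffic on the standard tier.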
o3 is currently available through the OpenAI API and Playground. Users with balances as low as a few dollars can now explore the model’s full capabilities, enabling prototyping and deployment with fewer financial barriers.
This could particularly benefit startups, research teams, and individual developers who previously found higher-tier model access cost-prohibitive.
By substantially lowering the cost of its most advanced reasoning model, OpenAI is signaling a broader trend in the generative AI space: premium performance is quickly becoming more affordable, and developers now have a growing number of viable, economically scalable options.