Inside Reflection AI: The $20B Open-Model Startup That Has Yet to Ship

Intro

On March 2, the Financial Times reported that Reflection AI is raising at least another $2 billion, with its potential valuation approaching $20 billion. This follows a rapid series of raises: the company emerged from stealth in March 2025 with $130 million and a valuation of around $545 million, then raised $2 billion at an $8 billion valuation in October 2025, and is now back in the market again. Reflection frames its mission in the language of openness and open science. The awkward part is that, as of early March 2026, the frontier open-weight model at the center of its pitch still has not been released publicly, its code research agent Asimov remains on a waitlist, and the company’s website features product docs and blog posts but no research papers.
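To put the pace of those raises in perspective, here is a minimal sketch of the valuation step-ups between rounds. It uses only the figures quoted above; the `rounds` list and the `step_ups` helper are illustrative names, and the March 2026 figure is the reported potential valuation, not a closed round.

```python
# Valuations as quoted in the text, in millions of USD.
# The last entry is a reported potential valuation, not a closed round.
rounds = [
    ("March 2025", 545),
    ("October 2025", 8_000),
    ("March 2026 (reported)", 20_000),
]


def step_ups(rounds):
    """Return (from_label, to_label, multiple) for each consecutive pair of rounds."""
    return [
        (a_label, b_label, round(b_val / a_val, 1))
        for (a_label, a_val), (b_label, b_val) in zip(rounds, rounds[1:])
    ]


for a_label, b_label, multiple in step_ups(rounds):
    print(f"{a_label} -> {b_label}: {multiple}x")
```

On these numbers, the valuation rose roughly 14.7x between the first two rounds and would rise another 2.5x if the new round closes near $20 billion.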

As it happened, we had recently spoken with Ioannis Antonoglou, co-founder and CTO of Reflection AI. I came away from that conversation with more doubts than answers, which made Reflection a natural fit for a deep-dive episode in our GenAI Unicorns series: either to dispel those doubts, or to confirm them.

What are they actually building? Why has it all been so slow and secretive if the promise is openness? Is it realistic that their much-discussed open-weight model could outperform the closed labs and Chinese contenders? How are they planning to make money? Is government demand alone enough to sustain the business? There is a lot to unpack.

Before we dive into Reflection, let’s do some events planning. We’re heading to NVIDIA’s GTC. Here are a few sessions where you can catch us (and if you can’t be there in person, register online for free and see all of them):

Physical AI for the Real World: A Vision From NVIDIA Robotics Research with Jim Fan
GPU ♥ LPU: Everything You Wanted to Know with Jonathan Ross
and of course a show that you have to see at least once in your life: GTC 2026 Keynote by Jensen Huang

Arriving early? This hackathon is promising to be super fun: March 15, from 9 am to 10 pm – a builder day for people working on robotics and inference. It’s a chance to get hands-on before GTC, compare what’s actually working in the real world, and build something solid in one day. There’ll be workshops, mentors, a hackathon, humanoid robots, and some very good prizes. It’s fully in person, space is limited, and everyone needs to apply.


Now let’s take a look at what’s brewing inside the Reflection AI lab.

In today’s episode:

Starting point of Reflection AI
From autonomous coding to the “missing Western open model”
“$2 billion later…” What Reflection has actually built so far
What “open” means here, and what it does not
Can Reflection realistically beat closed labs and Chinese contenders?
Business strategy: Sovereignty, enterprises, governments, and mass adoption
How big is the sovereign AI market, really?
What would make the strategy more believable
Conclusion
Bonus: Resources


Starting point of Reflection AI

Ioannis Antonoglou carries the original “AGI virus” of early DeepMind: the belief that AGI belongs first to science, as a serious research program, before it becomes a product category. He joined Shane Legg and Demis Hassabis as one of the founding engineers in 2012, at a time when working on that kind of thing still seemed almost weird, and when it was nearly impossible to imagine valuations for such startups becoming this silly. He later worked on the now-legendary DQN, AlphaGo, AlphaZero, and MuZero, and eventually on Gemini-related post-training at Google DeepMind. Across his public appearances, and in conversation, he comes across less like a startup salesman than like a mission-driven reinforcement learning lifer who simply never stopped believing that the road to AGI runs through agents that learn by doing.

Misha Laskin is different in temperament, but not in seriousness. Before Reflection, he worked on Gemini at DeepMind, after a Berkeley postdoc and an earlier startup detour. On his personal site he still describes himself as someone interested in how RL can unlock new capabilities in language and multimodal models. According to the public founder mythology, reading the AlphaGo paper changed the course of his life. That sounds dramatic, but in AI this is practically a genre.

The two met at Google DeepMind, where Antonoglou was leading post-training efforts for Gemini and Laskin was leading reward model development on the RLHF team. They knew extremely well how to train and scale reinforcement learning systems, especially in post-training, reward modeling, and building agents that improve through interaction and feedback.

What they felt was missing was a reliable general-purpose product surface: language models gave breadth, but not depth or dependable autonomy. In their view, the missing piece was combining LLM generality with RL-driven capability to create agents that can actually do useful work end to end. They left comfortable jobs at Google DeepMind and, in late February or early March 2024, started working on Reflection AI.

From autonomous coding to the “missing Western open model”

They spent the whole first year in stealth. In March 2025, Reflection AI emerged publicly with this thesis: building superintelligent autonomous systems, starting with autonomous coding. Its official two-step plan was simple: first build a superintelligent autonomous coding system, then use that blueprint to expand to other categories of computer-based work. The logic was straightforward enough to be compelling. Coding is measurable, digital, tool-rich, and close to the operating system of modern knowledge work.

Wonderful plan. Other companies were already shipping exactly that. Anthropic, for example, had just launched Claude Code in February 2025, which quickly became a flagship product and changed the way we think about coding.

You might think they doubled down on openness because Claude Code was not open and DeepSeek, along with other Chinese models, was tearing through the market with high-quality open models. But no. In Lightspeed's announcement about Reflection's Series A, openness didn't feature at all:

“Reflection AI is leveraging its deep expertise in reinforcement learning and large language models to solve autonomous coding – and, more broadly, unlock the path to superintelligence.”

From Lightspeed blog, March 2025

By the October 2025 Series B, the messaging had shifted:

“We're building frontier open intelligence accessible to all.”

From Reflection AI’s announcement, October 2025

In our conversation, Antonoglou explained the logic behind this pivot: "If you want to do reinforcement learning at the frontier, you need an extremely powerful base model to post-train," he said. "What we saw – especially with Llama 4 not being a particularly strong model – was that the whole Western ecosystem was missing a powerful open base model that we could even use to do reinforcement learning at scale. We realized this was a gap we were the only ones positioned to fill."

I pressed on whether this meant the target was still a coding model. "It's a general agent model," he said. Why the change? "Because we think it's important that there is a general open model in the West – and there isn't one. So we have to build it."

This pivot might be rational. But it is also dangerous. Rational, because RL-heavy post-training only gets you so far if the base model is weak. Dangerous, because it turns a wedge product into a civilization project. An autonomous coding company can ship product, iterate with customers, and build revenue while improving its models. A company trying to build the missing Western open frontier base model is now competing at the deepest layer of the stack, where the incumbents have more compute, more data, more iteration cycles, and more room for error.

What makes the danger concrete is how candidly Antonoglou described the current state. When I asked about applications, he said, very plainly: "The focus right now is just to build the models. Applications will follow, but it's all hands on deck. Building this model is challenging and requires all of our mental focus." When I asked how it was going, the answer was a non-answer: "There are lots of incredible things we'll have to share this year." This might be a decent research strategy. It is much less obviously a startup strategy.

That tension matters because Reflection’s original public promise was not “trust us, we are building a general open agent model from scratch and the applications can wait.” It was autonomous coding, then superintelligence. Now the company is basically saying: yes, but in order to do autonomous coding properly we first need to build the frontier open model that the rest of the Western ecosystem failed to build. That may turn out to be true. It also means the company has chosen the slowest possible road in the fastest market.

“$2 billion later…” What Reflection has actually built so far

So what has Reflection actually shipped?

They sort of shipped Asimov, launched in July 2025.

Asimov is described as a code research or code comprehension agent for complex enterprise codebases. Rather than chasing the crowded code generation market head-on, Reflection chose to focus on a messier and more realistic problem: engineers spend huge amounts of time understanding systems, tracing old decisions, and piecing together organizational memory that was never properly documented. Reflection argues that this is the real bottleneck, and they are probably right.

Allegedly, Asimov indexes codebases, architecture documents, GitHub discussions, Teams conversations, Jira tickets, and other sources of engineering context. Reflection’s docs make a point of contrasting this with standard RAG or tool-calling approaches. Their pitch is that most systems decide what is relevant before reasoning begins, while Asimov tries to reason across a much broader context from the start, using large context windows and context caching.

The problem is that it’s March 2026, and the onboarding flow still routes users through a waitlist, which you can’t actually join. If you try, it just takes you to the most recent blog post, from October 2025. That incompleteness and lag have become somewhat tiring.

What “open” means here, and what it does not

Reflection talks about openness with missionary intensity. Its October 2025 blog says it is building frontier open intelligence accessible to all. The careers page says it is developing open-weight models for individuals, agents, enterprises, and even nation states. The research page frames the project as keeping the foundation of intelligence open rather than captured by a handful of labs. In interviews, Antonoglou argues that openness improves research velocity, external validation, and even safety, because more people can inspect and stress-test the systems. That is the ideological core of the company.

The problem is that the present-tense evidence for openness is weak. Reflection’s public blog currently has only two posts. Its research page mostly highlights papers the team worked on before Reflection, not research artifacts released by Reflection itself. And its verified Hugging Face organization, as of now, shows zero public models and zero public datasets.

Five months ago, Laskin told TechCrunch that Reflection planned to release model weights for public use while largely keeping datasets and full training pipelines proprietary. That puts Reflection closer to the open-weight model of Meta or Mistral than to a fully open effort like Ai2’s OLMo program. There is nothing inherently illegitimate about that. But it is worth being honest about the vocabulary. “Open science” suggests a broader research practice than “we may release weights later while keeping the rest closed.”

Can Reflection realistically beat closed labs and Chinese contenders?

This is the heart of the case.

Could Reflection build a strong open-weight model? Absolutely. The founders are phenomenal researchers, with an exceptionally strong team of people who worked on PaLM, Gemini, AlphaGo, AlphaCode, GPT-class systems, and other frontier projects.

But the current valuation appears to assume something more than that. Reflection is increasingly being discussed as the American answer to DeepSeek and as a challenger at the frontier itself. That is a much higher bar, and it becomes harder to meet with every month the company remains pre-release.

Re: Chinese competitors: Reflection does echo part of the Chinese strategy: putting top talent to work on open models. But Chinese labs are shipping constantly, sometimes every month, sometimes every week, and publishing a tremendous amount of research along the way. They also have different options, including IPO paths, to sustain themselves commercially.

Re: frontier competitors: By March 2026, the closed labs had not stood still. Depending on how you count a “model iteration,” both OpenAI and Anthropic had shipped at least five major frontier updates in roughly a year, with no real challengers from the open-source side. They are also working hard on their coding agents and systems: Anthropic’s Claude Code run-rate revenue had surpassed $2.5 billion, and OpenAI reports fast user growth for Codex. The point is, these companies are not only improving models, they are turning them into habits, workflows, and revenue machines.

Re: broader open-source competition: The open side of the market is also more crowded than Reflection’s rhetoric suggests. Ai2 released OLMo 2 32B in March 2025 and OLMo 3 in November, both as part of a genuinely fully open effort rather than a weights-only compromise. OpenAI released gpt-oss in August 2025 under Apache 2.0, with open-weight reasoning models and strong tool-use performance. NVIDIA introduced the Nemotron 3 family as open models, data, and libraries for agentic AI.

I would argue they are already significantly behind, trying to enter the market with rhetoric more than delivery. Every serious player in this race has an economic engine behind it. Closed labs have products and revenue. Chinese labs have multiple ways to sustain aggressive open-model development. Even the strongest open efforts sit inside institutions with clear strategic backing. Reflection, by contrast, still appears to depend mainly on more fundraising. Without a released model, a real product, or a convincing business strategy, that is a fragile place to be.

Business strategy: Sovereignty, enterprises, governments, and mass adoption

Government and enterprise

In TechCrunch, Laskin said revenue would come from large enterprises building on top of Reflection’s models and from governments developing sovereign AI systems. Reflection’s materials repeatedly stress self-hosting, infrastructure control, privacy, and ownership of the AI stack. In our conversation, Antonoglou said that enterprises and governments want full control over their AI.

This argument has gained traction in the post-DeepSeek era. “Sovereign AI” has become shorthand for organizations wanting control over their data, compute, and deployment environments rather than depending on external APIs.

Antonoglou pointed to Chinese labs as proof that open models can coexist with strong commercial success. But the comparison is not entirely clean. Chinese labs operate in ecosystems that make commercialization of open models easier: large domestic platforms, different capital structures (like MiniMax’s recent IPO), more direct pathways to scale adoption, and stronger government support. Reflection does not yet have those structural advantages.

So the deeper question becomes whether sovereignty alone is enough to sustain the business. Government and sovereign deals can be large and politically attractive, but they are also slow, uneven, and tied to procurement cycles. And if government is their main customer, we might see Reflection becoming a research arm of the US state rather than a broadly adopted commercial AI company. At least, secrecy in that case would be fully justified.

Mass adoption

Antonoglou himself said the only metric that truly matters is adoption. That’s true. A $20-billion story cannot rest only on patriotic symbolism. It requires real deployment, repeated usage, and evidence that customers keep coming back.

But there is another problem with open models now: even committed open-source advocates don't actually use open models in their daily work. Antonoglou's response was: "People don't use something that's not as powerful. Why would they? People just want to use the best models out there. So you want to ensure that genuinely competitive models – models actually close to the frontier – are open. Until we have that, you'll always see this mismatch between what people want to see and what's actually happening in reality."

Precisely. Unfortunately, that also demonstrates that part of the business strategy rests on a model that does not yet exist. The argument isn't "open models are better today." It's "open models would be better and people would use them more if someone built a good enough one – and we're the ones who will."

Reflection’s ambiguity here becomes a real risk. The company says it has identified a scalable commercial model aligned with its open-intelligence strategy, but the public details remain thin. There is still no released foundation model, no public pricing, and no clear evidence of adoption at meaningful scale. That does not mean there is no business. It means the case for the business still relies largely on inference.

How big is the sovereign AI market, really?

It depends on what you count. On the narrow end, Gartner estimates sovereign cloud IaaS spending will reach $80 billion in 2026, which is the cleanest hard number for infrastructure directly tied to sovereignty requirements. On the broad end, McKinsey argues that 30% to 40% of global AI spending could be shaped by sovereignty requirements, implying a $500 billion to $600 billion market by 2030 across compute, data, models, platforms, and applications. In other words, sovereign AI is already a real market, but most of the money is likely to go first into infrastructure, cloud, and deployment, not into model vendors alone.
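The broad-end estimate can be sanity-checked with a little arithmetic. The sketch below backs out the total 2030 AI spending implied by the quoted share and market-size ranges. It assumes the two ranges can be combined freely, which the source does not specify, so the result is a rough bracket and not a forecast; the function and variable names are illustrative.

```python
from itertools import product

# Figures quoted in the text (broad-end estimate for 2030).
sovereign_market_bn = (500, 600)  # sovereignty-shaped market, billions of USD
sovereignty_share_pct = (30, 40)  # share of global AI spending, percent


def implied_total_bn(market_bn, share_pct):
    """Total AI spending implied by a sovereign market size and its share of the total."""
    return market_bn * 100 / share_pct


# Assumption: any pairing of the two ranges is possible, so try all four combinations.
totals = [
    implied_total_bn(market, share)
    for market, share in product(sovereign_market_bn, sovereignty_share_pct)
]
implied_range = (min(totals), max(totals))

print(f"Implied total AI spending in 2030: ${implied_range[0]:,.0f}B to ${implied_range[1]:,.0f}B")
```

Under that assumption, the broad-end numbers imply total AI spending of roughly $1.25 trillion to $2 trillion by 2030. The $80 billion infrastructure figure is for 2026, so the two estimates are not directly comparable year for year.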

For Reflection, the realistic target is not “capture sovereign AI” in the abstract. It is to become a serious model supplier for sovereign deployments: governments, regulated enterprises, and national or regional AI stacks that want open-weight models with more control than OpenAI or Anthropic offer. If Reflection executes well, a plausible outcome is a meaningful slice of the software-and-model layer inside that much larger infrastructure market, not the whole stack. The main contenders are DeepSeek and Qwen on the Chinese open-model side, plus Mistral, Ai2/OLMo, NVIDIA’s very strong open-model ecosystem, and the closed labs if buyers decide sovereignty matters less than performance.

What would make the strategy more believable

If Reflection wants the outside world to treat it as more than an expensive geopolitical thesis, it probably needs to do four things.

First, ship something public, soon. Not necessarily the final crown-jewel model, but something concrete enough to let the ecosystem evaluate the company’s actual model quality, release cadence, and taste. That includes their own research. Right now Reflection is living in the awkward zone where it is too advanced to be judged like a seed-stage startup and too unreleased to be judged like a frontier lab.

Second, do something with Asimov. The enterprise context problem Reflection identified is real and still under-served. Engineers do spend a huge amount of time reconstructing systems, decisions, and organizational memory. It’s simply weak to keep it on the website as a waitlist product without even a real signup path. If it matters, make it live. If it doesn’t, avoid sloppiness and just drop it.

Third, narrow the rhetoric. Reflection does not need to promise that its first frontier release will outrun the best closed labs and the best Chinese open labs. It needs to prove that it can release a genuinely competitive open-weight model that buyers can trust, customize, and deploy. Trying to sound like the inevitable best lab in the world while spending months without releasing or publishing anything tangible starts to read as either arrogance or delusion, or both. It just doesn’t sound like a serious business.

Fourth, turn the government thesis into actual business. If government is part of the story, make it part of the business. The policy tailwinds are clearly there: the administration has publicly backed a stronger US open-model ecosystem, and the FT says Reflection has already been discussed as a possible alternative to closed-model vendors in federal contracting. If that market is real, we need to see it soon.

Conclusion

Reflection AI may still work. The founders are unusually credible. The capital is real. The need for strong Western open-weight models is real. The sovereignty pitch is workable. And if anyone has a shot at building a reinforcement-learning-heavy open frontier lab outside the usual giants, Reflection is on the short list. But they are too slow and too secretive, and so far their public statements are not backed by any tangible proof.

Right now Reflection is asking the market to believe four things at once: that open weights can catch closed labs on capability, that reinforcement learning will be the decisive unlock, that sovereign and enterprise demand will finance frontier training, and that a startup can pause product focus long enough to build a frontier model from scratch while faster rivals keep shipping. Any one of those could turn out true. All four together is a high-wire act.

When I asked Ioannis when we would see something coming from Reflection AI, he said: “You will need to wait for it.”

In the current AI race, that doesn’t sound like the right answer.


Thank you for reading and supporting Turing Post 🤍 We appreciate you