Turing Post

AI 101: Everything You Need to Know about GPT OSS

OpenAI Goes Open – But What’s Really in the Box?

Alyona Vert., Ksenia Se & Will Schenk
August 06, 2025

someday soon something smarter than the smartest person you know will be running on a device in your pocket, helping you with whatever you want.

this is a very remarkable thing.

Sam Altman’s tweet (Aug 5, 2025)

Feels like a plot twist.
Feels like a comeback.
Feels like the beginning of something big.

Clem Delangue on his LinkedIn

That’s how Hugging Face CEO Clem Delangue described the GPT-OSS launch. In a week that also saw Anthropic release Claude Opus 4.1 and DeepMind unveil the jaw-dropping Genie 3 world simulator, OpenAI still managed to steal the spotlight with GPT-OSS, a family of powerful, permissively licensed open-weight models that immediately shot to the #1 trending spot on Hugging Face.

We all know that for the past year, the most exciting action in open-source AI has come from Chinese labs like Qwen, DeepSeek, Moonshot AI and others (see our recent breakdown). That’s why OpenAI’s re-entry into the arena is so important. After six months of quiet collaboration with Hugging Face, they’ve released two highly capable models that are already shaking up the ecosystem. So we’re changing our editorial schedule: instead of test-time compute, today we’re covering the one and only GPT-OSS.

The most shocking thing, of course, is that OpenAI finally noticed the “open” part in their name. The most admirable thing is that they’re setting an example for other closed-source American companies. But the community is also a whirlwind of conflicting takes, many of them arguing that beneath the surface of the “gift” lies a complex and calculated strategy. Is this a genuine return to the open-source ethos? Is it a strategic gambit to reclaim the narrative? Or a trick to lock developers into the OpenAI paradigm? Well, all of it and more.

In today’s episode, we will cover:

The Plot Twist: Why Is OpenAI Releasing Open Models Now?
The Models: What’s Under the Hood of GPT-OSS
Harmony: The New Prompting Standard?
Hands-On: Does It Actually Work (and How)?
Where to use it + Installation
The Performance Paradox: A Spiky, Brittle Genius?
Reclaiming the Open-Source Crown?
Safety, Red Teaming, and Worst-Case Scenarios
GPT-OSS: Official Results (more evals needed)
Conclusion: Your Guide to the GPT-OSS Family
Sources and further reading

The Plot Twist: Why Is OpenAI Releasing Open Models Now?

The story, as told by Clem Delangue, began six months ago, when Sam Altman declared at the AI Action Summit in Paris that OpenAI was serious about open source. It was a statement many found hard to believe. OpenAI’s journey from a non-profit research lab to the titan of proprietary AI is the stuff of Silicon Valley legend. Their last major open-weight language model release was GPT-2, an eternity ago in AI time.

So, why the change of heart?

There are a few compelling reasons. First, it was an Action Summit after all, and just in January DeepSeek R1 gave everyone a good kick in the butt for not acting toward openness. By showing its full chain-of-thought and using a permissive license, DeepSeek set a new standard for transparency and trust in reasoning models. And then, like mushrooms after rain, other Chinese models followed suit. Chinese! What an embarrassment for the States. Action was really needed.

That brings us to the second reason: the geopolitical angle. With growing calls for strong American open-source AI foundations to counterbalance the momentum from China, who better to deliver than the startup that has led the field? It’s a move to ensure the US remains at the forefront of what has become a global open-source race. Almost weird they didn’t call the model GPT-USA 🇺🇸.

They couldn’t do it alone, though. OpenAI doesn’t exactly enjoy the warmest reputation among developers these days. Hugging Face, on the other hand, is everyone’s darling. It was the obvious partner to turn to. In the world of AI, vibes matter – and Hugging Face brings the kind that makes open source feel like a movement, not a memo. Plus, their strong roots in Europe and support from figures like Yann LeCun made the move even more noticeable.

Third, as Nathan Lambert suggests, this is a strategic “scorched earth policy.” By releasing a powerful open model that undercuts both their own o4-mini API and competitors’ offerings, OpenAI could be clearing the lower end of the market ahead of a future GPT-5 release, hoping to capture the premium tier.

The Models: What’s Under the Hood of GPT-OSS

The GPT-OSS models are large-scale language models built on the same ideas as GPT-2 and GPT-3, but with a Mixture-of-Experts (MoE) transformer architecture that activates only a part of the model for each token. This cuts the compute needed per token and makes training and inference more efficient. Both models come with a permissive Apache 2.0 license, a 128k context window, and full access to the chain-of-thought.
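
To make "activate only a part of the model" concrete, here is a minimal sketch of top-k expert routing for a single token, in plain Python. It is a toy illustration of the general MoE idea, not OpenAI's implementation: the names `moe_forward` and `make_expert`, the tiny dimensions, the scaling "experts", the choice of `TOP_K = 4`, and the softmax over only the selected experts are all assumptions made for this example.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(token, router_weights, experts, top_k):
    """Route one token vector through its top_k highest-scoring experts."""
    # Router: one logit per expert (dot product of the token with that expert's row).
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    # Keep only the top_k experts; all others are skipped entirely for this token.
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Normalize gate weights over the chosen experts only (one common choice).
    gates = softmax([logits[i] for i in chosen])
    # Output is the gate-weighted sum of the chosen experts' outputs.
    out = [0.0] * len(token)
    for gate, i in zip(gates, chosen):
        expert_out = experts[i](token)
        out = [o + gate * v for o, v in zip(out, expert_out)]
    return out, chosen


def make_expert(scale):
    """Toy expert that scales its input; real experts are feed-forward networks."""
    return lambda v: [scale * x for x in v]


random.seed(0)
DIM, NUM_EXPERTS, TOP_K = 8, 32, 4  # hypothetical toy sizes; TOP_K is an assumption
router_weights = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
experts = [make_expert(i + 1) for i in range(NUM_EXPERTS)]
token = [random.gauss(0, 1) for _ in range(DIM)]

output, chosen = moe_forward(token, router_weights, experts, TOP_K)
print(f"experts used: {sorted(chosen)} ({len(chosen)} of {NUM_EXPERTS})")
```

Only `TOP_K` of the `NUM_EXPERTS` experts run for the token, which is where the compute savings come from. All expert weights still have to be stored, though, so the memory footprint follows the total parameter count rather than the active one.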

A Selective Step Toward Openness

From what we understand: OpenAI did not release the base models. What they’ve shared are the final, instruction-tuned, safety-aligned versions.

This distinction matters. By keeping the base models and training data private, OpenAI is safeguarding its core intellectual property, the foundational elements that underpin its competitive edge. In effect, they’ve provided the car, but not the blueprints to the factory. This approach allows them to foster community engagement and adoption while maintaining control over the essential ingredients of their technology. It’s a strategic decision that signals a measured, rather than absolute, commitment to openness, which is understandable when you need to make a profit.

OpenAI presented two versions of GPT-OSS:

GPT-OSS-120B: A large model with 36 transformer layers and ~117 billion total parameters, of which only ~5.1B are active per token. It has 128 experts per MoE block.
GPT-OSS-20B: A smaller version with 24 layers and ~21 billion total parameters, of which ~3.6B are active per token. It has 32 experts per MoE block.
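
A quick bit of arithmetic on the figures above shows how sparse each model is per token. The sketch below uses only the approximate parameter counts quoted in this article; the `MODELS` dictionary and the `active_fraction` helper are names made up for the example.

```python
# Approximate figures quoted above, in billions of parameters.
MODELS = {
    "gpt-oss-120b": {"total_b": 117.0, "active_b": 5.1},
    "gpt-oss-20b": {"total_b": 21.0, "active_b": 3.6},
}


def active_fraction(total_b, active_b):
    """Share of the model's parameters used for a single token."""
    return active_b / total_b


for name, spec in MODELS.items():
    frac = active_fraction(spec["total_b"], spec["active_b"])
    print(f"{name}: {frac:.1%} of parameters active per token")
```

By these numbers, the larger model touches roughly 4.4% of its parameters per token and the smaller one about 17.1%, so the 120B model is considerably sparser than its sibling.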

Thanks to quantization that shrinks the MoE layers, reducing their precision to about 4.25 bits per parameter, gpt-oss-120B can run on a single 80 GB GPU. The smaller gpt-oss-20B runs in just 16 GB of memory, making it accessible even on edge devices or laptops with 32 GB of RAM (though performance is slower without GPU acceleration). The hardware gap of 80 GB versus 16 GB is a bit jarring, especially since many top-tier laptop setups assume ~64 GB for powerful local inference. That said, these models make local deployment and rapid iteration feasible without requiring costly infrastructure.
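
A back-of-the-envelope estimate shows why those memory figures are plausible. The sketch below is a rough approximation under a simplifying assumption: it treats every parameter as stored at 4.25 bits, although only the MoE layers are quantized that way, and it ignores the KV cache and activations. Real footprints will therefore be somewhat higher. The helper name `weights_gb` is made up for this example.

```python
def weights_gb(params_billions, bits_per_param):
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    total_bits = params_billions * 1e9 * bits_per_param
    return total_bits / 8 / 1e9


for name, params_b in [("gpt-oss-120b", 117.0), ("gpt-oss-20b", 21.0)]:
    quantized = weights_gb(params_b, 4.25)  # simplification: all weights at 4.25 bits
    half = weights_gb(params_b, 16)         # same weights at 16-bit precision, for contrast
    print(f"{name}: ~{quantized:.1f} GB at 4.25 bits vs ~{half:.0f} GB at 16 bits")
```

Under these assumptions the weights come to roughly 62 GB and 11 GB, which fits within 80 GB and 16 GB respectively with some headroom. At 16 bits the same parameter counts would need about 234 GB and 42 GB.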

Let’s look more precisely at the main components of GPT-OSS’s MoE architecture:
