
FOD#127: Breaking: Is Microsoft crossing the chasm?

Ignite tells an interesting story

This Week in Turing Post:

  • Wednesday / AI 101 series: What is Continual Learning? (Including Nested Learning)

  • Friday / Lessons learnt from AIE Code in NYC

In the guide, you’ll learn:

  • A New Testing Playbook to properly test systems that don't follow a predictable path, using "checkpoint verification"

  • When to use a single, capable agent and when to split into a multi-agent design based on domain, governance, or model requirements.

  • How to build agents that capture real value through judgment, rather than simple, rule-based tasks.

Get the 5 field-tested lessons for building production-ready AI agents now!

Our news digest is always free. Click on the partner’s link to support us or Upgrade to receive our deep dives in full, directly into your inbox →

Editorial: Crossing the Chasm

According to Microsoft, Windows runs on 1.4 billion monthly active devices. A huge share of them are mid-level corporate workers, schools, government offices, public agencies and all the unglamorous places where real work happens. These people will not “adopt” AI. They will simply turn on their computers and AI will already be there.

I thought Apple Intelligence would do that. But it’s looking more and more like Microsoft is the one actually bridging the gap between early adopters and early majority.

Who can cross the chasm?

If you're unfamiliar: Crossing the Chasm is a 1991 tech strategy book by Geoffrey Moore, built around the idea that most new technologies die in the gap between early enthusiasts and mainstream users. Early adopters are excited by novelty and promise. The early majority only shows up when it works out of the box, fits into existing workflows, and doesn’t require rethinking their job. Crossing that gap takes more than a better product – it takes distribution, defaults, and trust.

Does Microsoft have all of it?

This email is one day late because of Microsoft Ignite, which is booming right now in San Francisco with a crazy 20,000 attendees. You might assume I’ll be talking about Foundry, Azure, etc. But if you read my newsletter, you heard most of it already when I covered Microsoft Build in May. What was a concept then is now materializing and reaching actual users.

And that word – user – is the key.

OpenAI says it has 800 million weekly users – that’s a big number! But those are all intentional users. People who went to a URL, signed up, and stayed curious. That’s the early adopter ceiling. Microsoft, on the other hand, doesn’t really need your curiosity. If you are using its software, you are hooked.


There is no intention there, but there is gravity.

And Ignite showed the machinery behind that force.

First, Microsoft formalized agents as organizational actors, with identity in Entra and a control plane in Agent 365. Then it locked in the protocol layer: MCP becomes the socket agents use to talk to tools across the Microsoft stack — M365, Dynamics, Power Platform, Dataverse, and crucially, Windows.
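For the protocol-curious: MCP is JSON-RPC 2.0 under the hood, and a tool invocation is just a `tools/call` request. A minimal sketch of what that looks like on the wire – the tool name and arguments here are made up for illustration:

```python
import json

# MCP (Model Context Protocol) messages are JSON-RPC 2.0. A client asks a
# server to run a tool via a "tools/call" request. The tool name and
# arguments below are hypothetical, purely for illustration.
def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP-style tools/call request."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

msg = make_tool_call(1, "read_file", {"path": "quarterly_report.docx"})
parsed = json.loads(msg)
print(parsed["method"])          # tools/call
print(parsed["params"]["name"])  # read_file
```

The point of a shared socket like this: any agent that speaks JSON-RPC can call any tool server, whether it lives in M365, Windows, or a third-party product.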

Then came the OS itself. Windows is now a habitat for agents. Not just another UI. The Agent Workspace isolates actions. File and settings connectors expose local context. The taskbar becomes a normal entry point. Not a prompt box — a persistent interface.

And on the data side, Work IQ, Fabric IQ, and Foundry IQ start acting like a semantic memory — one that knows who you are, what the org knows, and what can be used. Security and compliance are finally in the loop: Purview and Defender observe agent behavior under policy.

AI stops being a product and becomes part of the system. AI becomes routine.

This is Microsoft’s strength — and it echoes something Ben Thompson wrote years ago: platforms win not because of better UX, but because of default aggregation. AI doesn’t need to win attention. It needs to win distribution. And that’s exactly what Windows, Entra, and MCP are doing.

In the recent interview we discuss in our Attention Span below, Satya Nadella introduced a new framing: computers for AI. Strip the marketing, and you get a clear architecture:

  • Training computer – Azure with Foundry on top, routing across models, hosting agents, coordinating multi-agent plans, and evaluating cost and quality.

  • Context computer – Fabric IQ, Work IQ, Foundry IQ, a permissioned semantic layer that gives agents a live view of people, data and tasks.

  • Agent computer – Windows, the Microsoft 365 apps, and Edge, where actions actually happen near files and browsers.

  • Governance computer – Entra, Purview, Defender and Agent 365, the identity, policy and audit spine.

Four layers: training, memory, action, law. All aligned on top of an installed base counted in billions.

Ignite is a bet that the real margin pool will sit in Foundry, Copilot and Agent 365, plus the many models on Azure, where tokens are turned into workflows under enterprise policy.

Not without limitations

There is still a wall. If you already run on Microsoft, the path is short. Agents arrive through Windows Update and tenant policy. If not, switching OSes is a bigger lift. This is where growth intention matters more than another feature race. Microsoft will expand where the switch is cheap and the value is obvious.

Four places to watch:

  1. AI PCs – hardware that ships with NPUs, voice entry, agent workspace enabled, and a straightforward Copilot trial. This is an on-ramp for households and small firms that would never create an OpenAI account.

  2. GitHub – the habit engine for developers. When “assign an issue to a coding agent” becomes boring and reliable, companies that never planned to adopt Microsoft AI will use it because code ships faster.

  3. Collaboration/Acquisitions – Hugging Face seems like an ideal partner.

  4. MCP outside the castle – if Atlassian, Salesforce, ServiceNow, SAP or Snowflake expose first-party MCP servers, Teams and Windows turn into a neutral switchboard rather than a trap.

Microsoft’s strategy is clear. Now we watch whether it can flip the setting from “AI as a curious case” to “AI as the default state of work.” Then the chasm will be crossed.

Attention Span: Not intentional, but today’s Attention Span is also about Microsoft – specifically, Satya Nadella being grilled by Dylan Patel (SemiAnalysis) on Dwarkesh’s podcast. Watch it here

Curated Collections – It’s all about JEPA this week

Follow us on 🎥 YouTube, Twitter, Hugging Face 🤗

News from The Usual Suspects ©

  • Gemini 3 is in the air! Everyone is talking about it, but it’s not out yet. Let’s wait and see it in action. Other news →

  • Quora’s Mission Upgrade
    Does anyone still use Quora? Some people do, though it’s not really clear to me why in the age of AI and AI slop. But Adam D’Angelo thinks the important thing for Quora now is not sharing knowledge (and slop) but nurturing “collective intelligence.” With Poe now handling multi-modal queries in private chats and Quora continuing as the public town square, they’re betting on synergy between humans and AI to accelerate problem-solving, tech progress, and, of course, engagement. We’ll see how that works.

  • OpenAI releases GPT-5.1, adds tone controls
    GPT-5.1 is out, bringing updates to both Instant and Thinking models. Responses are meant to be more natural, reasoning more adaptive, and tone more customizable. There's a stronger push toward conversational warmth and user control. My personal experience is disappointing. For the first time since ChatGPT launched, I’m thinking of canceling the subscription (I’m on Pro).

    They also published a GPT-5.1 Prompting Guide, which is interesting because it reveals parts of the system prompt – helping you understand the model’s personality better – and gives a few recommendations on how to prompt it efficiently.

  • DeepMind acting in 3D with SIMA 2
    DeepMind introduces SIMA 2, a Gemini-powered agent that can reason, act, and learn in complex 3D virtual worlds. Unlike its predecessor, SIMA 2 doesn’t just follow instructions – it adapts, explains its decisions, and improves through self-play. Tested in games it wasn’t trained on, it shows promising generalization. A step forward for embodied AI, though challenges like long-horizon memory and precision remain.

  • Claude 4.5 Tackles Political Bias – Carefully

    Anthropic released a detailed report on Claude’s political neutrality, introducing a new automated test for “even-handedness.” Claude Sonnet 4.5 scored higher than GPT-5 and Llama 4, performing on par with Grok 4 and Gemini 2.5 Pro. The company 🦋 open-sourced its methodology, aiming to establish shared standards for measuring political bias in AI.

This emoji 🦋 means open-source.

Models to pay attention to:

  • 🦋 NVIDIA open-sourced Apollo
    NVIDIA unveiled Apollo, an open family of AI physics models optimized for industrial simulation in areas like fluid dynamics, electromagnetics, and multiphysics. Apollo integrates transformers, neural operators, and diffusion methods with domain-specific knowledge. Early adopters, including Siemens, Applied Materials, and Northrop Grumman, report speedups up to 500x in simulations. Pretrained models and workflows support custom training, inference, and deployment, enabling real-time digital twins and surrogate modeling across aerospace, semiconductors, and manufacturing →read their blog

  • Marble, a multimodal world model

    Researchers from World Labs have released Marble, a multimodal generative model that creates, edits, and composes 3D worlds from text, images, video, or coarse 3D layouts. It supports iterative editing, fine control via a tool called Chisel, and export to Gaussian splats, triangle meshes, or video. Marble enables multi-image world generation, dynamic expansion, and high-fidelity visual enhancement. It targets creative industries, simulation, robotics, and spatial intelligence applications →read their blog

  • 🦋 Omnilingual ASR, a speech-to-text model

    Researchers from Meta introduced Omnilingual ASR, a suite of models supporting speech-to-text for over 1,600 languages, including 500 low-resource ones. It features a 7B-parameter wav2vec 2.0 encoder and two decoders: CTC and transformer-based LLM-ASR, achieving <10% character error rate in 78% of languages. The system enables new-language onboarding using just a few paired samples. Meta also released the 350-language Omnilingual ASR Corpus and open-sourced all models and tools →read their blog
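A quick reminder on the metric behind that “<10% character error rate” claim: CER is just the character-level edit distance between the hypothesis and reference transcripts, divided by the reference length. A generic stdlib-only sketch of the metric (an illustration, not Meta’s evaluation code):

```python
# Character error rate (CER): Levenshtein edit distance between hypothesis
# and reference, normalized by the reference length. Generic sketch only.
def edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def cer(reference: str, hypothesis: str) -> float:
    return edit_distance(reference, hypothesis) / max(len(reference), 1)

print(cer("hello world", "hello world"))             # 0.0
print(round(cer("hello world", "hallo world"), 3))   # 0.091
```

A CER under 10% means fewer than one in ten reference characters needs correcting – a common bar for “usable” transcription in low-resource settings.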

The freshest research papers, categorized for your convenience

We organize research papers by goal-oriented or functional categories to make it easier to explore related developments and compare approaches. As always, papers we particularly recommend are marked with 🌟.

Highlight:

  • 🌟 🌟 🌟 LeJEPA – most likely the last paper co-authored by Yann LeCun at Meta-FAIR – is the next step for JEPA

Reinforcement Learning & Reasoning Dynamics

  • 🌟 The Path Not Taken: RLVR Provably Learns Off the Principals (by Meta AI, The University of Texas) – characterize RLVR updates as moving off principal directions in weight space, explaining sparse-looking changes via model geometry and contrasting RL’s regime with SFT for future geometry-aware algorithms →read the paper

  • SofT-GRPO: Surpassing Discrete-Token LLM Reinforcement Learning via Gumbel-Reparameterized Soft-Thinking Policy Optimization – reinforce soft-thinking chains with Gumbel-Softmax and reparameterized gradients so continuous reasoning traces slightly outperform discrete GRPO on Pass@1 and substantially on Pass@32 →read the paper

  • 🌟 RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments – scale RL via 400 adaptive, verifiable environments that tune difficulty to the policy, yielding stronger general reasoning than continued RL on static data →read the paper

  • DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation – design RLVR datasets, entropy schedules, and a hard-focus curriculum (Pre-GRPO) that push code-generation models to SOTA performance on LeetCode and Codeforces →read the paper

  • RedOne 2.0: Rethinking Domain-specific LLM Post-Training in Social Networking Services – stage exploratory RL, targeted SFT, and SNS-focused refinement RL to adapt a 4B model to social workloads while preserving robustness and improving data efficiency →read the paper

Architectures, Routing & Compute–Quality Tradeoffs

  • 🌟🌟 Intelligence per Watt: Measuring Intelligence Efficiency of Local AI (by Stanford, Together AI) – benchmark 20+ local LLMs across 8 accelerators and 1M queries using accuracy-per-power (IPW), showing rapid gains in local capability and room for hardware optimization →read the paper

  • Routing Manifold Alignment Improves Generalization of Mixture-of-Experts LLMs – align routing-weight manifolds with task-embedding manifolds so similar tasks share expert patterns, improving MoE generalization via lightweight router finetuning →read the paper

  • 🌟 Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence – retrofit pretrained transformers with depth recurrence and a recurrence curriculum to increase effective depth and beat standard post-training at matched compute →read the paper

  • 🌟 TiDAR: Think in Diffusion, Talk in Autoregression (by NVIDIA) – fuse diffusion-based drafting and AR sampling inside one structured-attention forward pass to match AR quality while delivering several-fold token throughput and full KV-cache support →read the paper

Distillation, Collaboration & Introspective Verification

  • 🌟 Black-Box On-Policy Distillation of Large Language Models (by Microsoft) – train a student as a generator against a discriminator that distinguishes teacher vs student outputs, enabling on-policy black-box distillation that matches GPT-5-Chat quality for a 14B student →read the paper

  • Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads – attach tiny uncertainty heads (UHeads) to a frozen LLM and use its hidden states to estimate step-level uncertainty, matching or beating huge PRMs for reasoning verification →read the paper

  • Optimizing Diversity and Quality through Base-Aligned Model Collaboration – route token-level decoding between a base and its aligned variant based on uncertainty and semantic role to jointly boost diversity and quality in open-ended generation →read the paper

Memory, Retrieval & Long-Horizon Context

  • 🌟 Beyond Fact Retrieval: Episodic Memory for RAG with Generative Semantic Workspaces (by University of California) – construct a neuro-inspired Generative Semantic Workspace that incrementally builds structured episodic memory, enabling large gains on long-horizon EpBench with fewer context tokens →read the paper

Multi-Agent Systems, Conversation & Open-World Science

  • Adaptive Multi-Agent Response Refinement in Conversational Systems – refine responses with specialized agents for factuality, personalization, and coherence, coordinating them dynamically per query to outperform single-agent refiners on challenging dialogue tasks →read the paper

  • IterResearch: Rethinking Long-Horizon Agents via Markovian State Reconstruction – reformulate deep research as iterative Markovian workspace reconstruction with RL-based Efficiency-Aware Policy Optimization, scaling interactions to 2048 steps and sharply improving long-horizon performance →read the paper

  • The Station: An Open-World Environment for AI-Driven Discovery – simulate an open-world scientific ecosystem where agents read, hypothesize, code, and publish without central coordination, yielding emergent methods and SOTA on diverse scientific benchmarks →read the paper

That’s all for today. Thank you for reading! Please send this newsletter to colleagues if it can help them enhance their understanding of AI and stay ahead of the curve.
