
8 Things You Need To Know About Agentic Web

We discuss Microsoft's strategy, unveiled through its recent announcements at Microsoft Build

This Week in Turing Post:

  • Wednesday, AI 101: hot from the oven – we discuss Google DeepMind’s AlphaEvolve and OpenAI’s Codex

  • Friday, we continue our Agentic Workflow series with one fascinating development

You are currently on the free list. Join Premium members from top companies like Hugging Face, Microsoft, Google, a16z, Datadog plus AI labs such as Ai2, MIT, Berkeley, .gov, and thousands of others to really understand what’s going on with AI →

If you want to support us without getting a subscription – do it here.

Today, we’re at Microsoft Build – a ginormous event for 5,000+ developers and analysts. This year, it feels like Microsoft finally pulled itself together, remembered it’s a platform company, and came up with a cohesive concept that could completely redefine how we interact with technology, build software, and conduct business. They are making over 50 announcements today, but what I like is that they are all part of a coherent story: we’re moving confidently into the era of the agentic web – with AI agents stepping into the spotlight as first-class business entities.

The scale is already immense: Visual Studio now boasts 50 million users, and GitHub serves 150 million, with GitHub Copilot already used by over 15 million developers. This foundation is now being supercharged for an agent-driven future.

TL;DR – What ML/AI Engineers Should Know:

  • Agent identity, access, and governance are starting to take shape in practical ways.

  • Interoperability is improving through emerging protocols like MCP and A2A.

  • GitHub is becoming a more complete environment for building, testing, and shipping agents (not great news for platforms like Cursor!)

  • Azure AI Foundry is positioning itself as a control center for managing agent memory, orchestration, and debugging.

  • Agents are moving beyond the cloud – running locally, inside secure orgs, and across the open web.

  • Open source remains a key pillar, with tools like Copilot in VS Code, WSL, and NLWeb now available.

What’s notable is how structured the Microsoft offering has become. Unlike Google, which often drops awesome shit (ChatGPT suggests “dazzling capabilities”) into the wild and lets developers figure out the rest, Microsoft has bundled its agentic vision into a coherent, open-but-integrated stack.

It doesn’t try to lock you in – they made it clear they are not repeating their Internet Explorer mistake – but it absolutely aims to seduce you with convenience, consistency, and scale. They even made the design of the coding agent look exactly like Cursor and Claude (which has proved great for many). They are enthusiastically building the open agentic web – and these are 8 crucial things you need to know about it:

  1. Agents as First-Class Business & M365 Entities:
    Microsoft is explicitly positioning agents as "first-of-their-kind reasoning agents for work." The updated Microsoft 365 Copilot exemplifies this, integrating chat, search, notebooks, create functionalities, and specialized agents like "Researcher" and "Analyst" into a unified experience. With Copilot Tuning, businesses can fine-tune these agents with their unique company knowledge, language, and even brand voice – think of it as a "brand book on steroids you can talk to." The scale is undeniable, with a projected 1.3 billion AI agents by 2028 and over 230,000 organizations already using Copilot Studio to create custom agents (a 130% quarter-over-quarter increase).

  2. Know Your Agents – ID, Please (Centralized Governance):
    With increased autonomy, agents’ identity and security become critical. Microsoft Entra Agent ID assigns unique, verifiable identities to AI agents created in Copilot Studio and Azure AI Foundry, "analogous to etching a unique VIN number into every new car." This centralizes agent and user management, allowing admins to see all agents and understand their access privileges.
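Microsoft hasn’t published the Entra Agent ID API surface in this announcement, but the core idea – every agent gets a unique, verifiable identity at creation time, and an admin can enumerate agents and audit their privileges – can be sketched in a few lines. All names here (`AgentRegistry`, `register`, `audit`) are illustrative, not the real API:

```python
# Toy sketch of centralized agent identity and governance.
# Every agent gets a unique ID ("like a VIN") plus an owner and a
# privilege set; an admin can list all agents and what they can access.
import uuid
from dataclasses import dataclass, field


@dataclass
class AgentIdentity:
    agent_id: str                          # unique, verifiable identifier
    owner: str                             # who created the agent
    privileges: set = field(default_factory=set)


class AgentRegistry:
    """Hypothetical central registry, standing in for a directory service."""

    def __init__(self):
        self._agents: dict[str, AgentIdentity] = {}

    def register(self, owner: str, privileges: set) -> AgentIdentity:
        identity = AgentIdentity(str(uuid.uuid4()), owner, set(privileges))
        self._agents[identity.agent_id] = identity
        return identity

    def audit(self) -> list:
        # Admin view: every agent, its owner, and its access privileges.
        return [(a.agent_id, a.owner, a.privileges)
                for a in self._agents.values()]


registry = AgentRegistry()
bot = registry.register("copilot-studio", {"read:crm", "send:email"})
```

The point of the pattern is that access decisions and audits key off one canonical record per agent, exactly as they already do for human users in a directory.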

  3. The Foundational Layer: NLWeb, MCP, and Open Protocols:
    Just as HTTP and HTML laid the groundwork for the traditional web, Microsoft is championing NLWeb (conceived by R.V. Guha, the mind behind RSS, RDF, and Schema.org, now a CVP at Microsoft) as a foundational technology for the agentic web, allowing any website to create natural language interfaces. Crucially, every NLWeb instance functions as a Model Context Protocol (MCP) server. MCP, akin to HTTP in its simplicity and power, is an open standard designed to allow AI agents and LLM-powered apps to securely and consistently access data and services from diverse sources. Microsoft is throwing its full weight behind MCP, with broad first-party support across GitHub, Copilot Studio, Dynamics 365, Azure AI Foundry, and even Windows 11. This, along with the Agent2Agent (A2A) protocol for peer-to-peer agent communication, fosters an open, interoperable agentic ecosystem. It is big.
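To make the "MCP is like HTTP" claim concrete: MCP is built on JSON-RPC 2.0, and `tools/list` / `tools/call` are real method names from the spec. The dispatcher below is a deliberately simplified, dependency-free sketch of what an NLWeb-style MCP server does; the `ask_site` tool and the `handle` function are placeholders, not the official SDK:

```python
# Minimal, illustrative MCP-style request handler.
# A client first calls "tools/list" to discover capabilities,
# then "tools/call" to invoke one.

TOOLS = [{
    "name": "ask_site",  # hypothetical tool an NLWeb site might expose
    "description": "Answer a natural-language question about this site",
}]


def ask_site(question: str) -> str:
    # Stand-in for the site's actual natural-language backend.
    return f"(demo answer for: {question})"


def handle(request: dict) -> dict:
    """Dispatch one JSON-RPC 2.0 request to a tool."""
    rid = request.get("id")
    if request["method"] == "tools/list":
        return {"jsonrpc": "2.0", "id": rid, "result": {"tools": TOOLS}}
    if request["method"] == "tools/call":
        args = request["params"]["arguments"]
        text = ask_site(args["question"])
        return {"jsonrpc": "2.0", "id": rid,
                "result": {"content": [{"type": "text", "text": text}]}}
    return {"jsonrpc": "2.0", "id": rid,
            "error": {"code": -32601, "message": "Method not found"}}


listing = handle({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
```

Because every tool advertises itself through the same two methods, any MCP-speaking agent can use any MCP server without bespoke integration – that uniformity is what makes the HTTP analogy apt.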

  4. Agentic DevOps is Revolutionizing Software Development:
    The software development lifecycle itself is being transformed by "Agentic DevOps." The new GitHub Copilot coding agent (now available to everybody and bearing a striking resemblance to interfaces like Cursor and Claude's) can autonomously refactor code, improve test coverage, fix defects, and even implement new features by simply being assigned an issue. This extends to modernizing legacy Java and .NET apps and even includes an SRE agent to autonomously handle live site issues like memory leaks, from triage to root cause analysis and mitigation. Another cool thing: the open-sourcing of GitHub Copilot's AI integration within VS Code.

  5. The "Agent Factory" is Open: Azure AI Foundry & Copilot Studio:
    Creating, managing, and scaling these sophisticated agents requires robust platforms. Azure AI Foundry is positioned as a unified "agent factory" to design, customize, and manage AI apps and agents, offering access to over 1,900 models (they call it Bring Your Own Models (BYOM)) and tools for evaluation, monitoring, and memory. Plus, there is the Agent Store, where developers can publish agents to millions of Copilot and Teams users.

  6. Collaboration is Key: Human-Agent, Agent-to-Agent, and Teams as a Hub:
    The future is collaborative: humans with agents, and agents with each other. Microsoft Teams is being pushed as a "multiplayer" hub for this, with an enhanced Teams AI library supporting MCP and enabling A2A for secure, peer-to-peer communication between agents. Developers can also add semantic memory to their Teams agents.

  7. Democratization & On-Device Power: Windows AI Foundry & Open Initiatives:
    The agentic era isn't confined to the cloud. Windows AI Foundry (an evolution of Windows Copilot Runtime, used internally for building features like Recall) is a unified platform for local AI development, making it easier to run AI models, tools, and agents directly on Windows 11. Foundry Local extends this support to macOS. Native support for MCP on Windows will allow AI agents to connect with native Windows apps.

    We loved that Microsoft open-sourced the entirety of WSL (Windows Subsystem for Linux), alongside NLWeb and Copilot in VS Code.

  8. Beyond Business: Microsoft Discovery for the Scientific Frontier:
    In a move that seems to parallel efforts like Google DeepMind, Microsoft unveiled Microsoft Discovery, a dedicated platform to accelerate scientific breakthroughs. This platform enables scientists to leverage AI agents for:
    * Reasoning over knowledge: Utilizing a powerful graph-based knowledge engine (Graph RAG) over public and private scientific data.
    * Generating hypotheses: Creating novel research plans and candidate molecules/materials.
    * Running experiments: Orchestrating complex simulations and analyses.
    John Link, a visibly enthusiastic chemistry lead, showcased a real-world discovery using the platform: a novel, safer, PFAS-free immersion coolant for data centers, moving from concept to lab-verified material. This signals a profound application of agentic AI to tackle humanity's grand challenges. His excitement was truly inspiring.
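Microsoft hasn’t published Discovery’s internals, but the "reasoning over knowledge" step rests on a well-known pattern: instead of retrieving context by text similarity alone, Graph RAG walks a knowledge graph around the entities in a query and hands the connected facts to the model. Here is a toy sketch of that retrieval step; the graph contents are made up for illustration:

```python
# Toy graph-RAG retrieval: collect all facts within `hops` edges of the
# entities mentioned in a query, to use as grounding context.
from collections import deque

# (subject, relation, object) triples - a tiny mock scientific KG
TRIPLES = [
    ("PFAS", "used_in", "immersion coolant"),
    ("immersion coolant", "cools", "data centers"),
    ("PFAS", "classified_as", "persistent pollutant"),
    ("fluoroketone", "candidate_for", "immersion coolant"),
]


def neighbors(entity):
    """Yield (triple, neighboring entity) pairs touching `entity`."""
    for s, r, o in TRIPLES:
        if s == entity:
            yield (s, r, o), o
        if o == entity:
            yield (s, r, o), s


def retrieve(seed_entities, hops=2):
    """Breadth-first walk: gather every triple within `hops` of the seeds."""
    seen, facts = set(seed_entities), set()
    frontier = deque((e, 0) for e in seed_entities)
    while frontier:
        entity, depth = frontier.popleft()
        if depth == hops:
            continue
        for triple, nxt in neighbors(entity):
            facts.add(triple)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return sorted(facts)


# Context an agent would assemble before generating hypotheses:
context = retrieve(["PFAS"])
```

The two-hop walk is what surfaces the non-obvious link – PFAS is used in immersion coolants, and a fluoroketone is a candidate replacement – which is the kind of connection a similarity search over flat text can easily miss.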

Microsoft is making a strong case for building the agent-driven future on its stack. Open protocols and open-source tools help grow the ecosystem, but the real pull is its tightly integrated platform. Enterprises now need to think seriously about how this shift will reshape their operations and competitiveness.

Welcome to Monday. Wait… it’s only Monday? Feels like a whole lot has already happened. Tomorrow is Google I/O, though – and they’re cooking something up to impress.

Curated Collections

The following collection is a great addition to our recent explanation of GRPO and Flow-GRPO

Follow us on  🎥 YouTube Twitter  Hugging Face 🤗

We are reading/watching

News from The Usual Suspects ©

The most interesting news came from Google, which introduced AlphaEvolve, and from OpenAI, which unveiled Codex. AlphaEvolve is an evolutionary coding agent designed to autonomously discover novel algorithms and scientific solutions. Codex is an AI-powered coding assistant that writes, tests, and fixes code, acting as a virtual coworker within ChatGPT. In line with our ongoing exploration of the agentic web and programming tools, we’ll cover both of these releases in detail on Wednesday.

Models to pay attention to:

  • Seed1.5-VL by ByteDance advances agent-centric multimodal reasoning with a compact vision encoder and MoE-based LLM that outperforms competitors on 38 of 60 benchmarks and leads in GUI and gameplay tasks.

  • BLIP3-o by Salesforce unifies image understanding and generation by using a diffusion transformer for CLIP feature generation and sequential training that preserves comprehension while enabling generation.

  • Skywork-VL Reward by Skywork provides multimodal reward signals using a preference-aligned model that improves mixed optimization techniques and sets new standards on VL-RewardBench.

  • 🌟 Aya Vision by Cohere scales multilingual multimodal generation using synthetic annotation and cross-modal merging, outperforming larger models in multilingual visual reasoning tasks.

  • Behind Maya supports cultural and linguistic diversity in VLMs with multilingual image-text pretraining across eight languages to improve low-resource language understanding.

  • MiMo by Xiaomi unlocks structured reasoning by combining large-scale pretraining with reinforcement post-training on math and coding, beating much larger models on reasoning tasks.

  • AM-Thinking-v1 by Beike elevates open-source reasoning to the 32B scale by fine-tuning Qwen2.5 using RL and SFT, resulting in state-of-the-art scores in math and code.

  • 🌟 INTELLECT-2 by Prime Intellect Team trains a reasoning model via globally decentralized reinforcement learning, innovating with asynchronous infrastructure and outperforming the 32B reasoning baseline.

  • 🌟 MiniMax-Speech by MiniMax achieves zero-shot high-quality speech synthesis with a learnable speaker encoder and Flow-VAE, excelling in cloning, emotion control, and multilingual support.

  • 🌟 SWE-1 by Windsurf accelerates software engineering workflows by integrating tool-aware reasoning, incomplete state tracking, and timeline-based context for superior agentic coding experiences.

  • 🌟 PointArena by researchers from University of Washington, Ai2, Anderson Collegiate Vocational Institute is a new evaluation platform that benchmarks multimodal models’ ability to point in images using natural language. It features three components: Point-Bench (982 tasks across five reasoning types), Point-Battle (live pairwise comparisons with 4,500+ votes), and Point-Act (robotic pointing execution). Molmo-72B outperforms others across all stages. Surprisingly, Chain-of-Thought hurts performance. PointArena reveals that precise pointing – not reasoning – best grounds language in real-world tasks. https://pointarena.github.io

The freshest research papers, categorized for your convenience

Going forward, we'll organize research papers by goal-oriented or functional categories to make it easier to explore related developments and compare approaches. As always, papers we particularly recommend are marked with 🌟

Improving Reasoning Capabilities and Model Alignment

  • 🌟 Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models proposes aligning models with meta-reasoning abilities (deduction, induction, abduction) to improve reasoning reliability and performance → read the paper

  • 🌟 J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning trains evaluators to produce better judgments through structured rewards for thoughtful assessment → read the paper

  • 🌟 The CoT Encyclopedia maps the diversity of reasoning strategies in chain-of-thought generations, helping steer and evaluate model thinking → read the paper

  • Learning from Peers in Reasoning Models enables collaborative error correction during reasoning by having different reasoning paths share intermediate summaries → read the paper

  • WorldPM: Scaling Human Preference Modeling establishes scaling laws for preference modeling and shows emergent behavior in large models trained on large-scale preference data → read the paper

  • Reinforced Internal-External Knowledge Synergistic Reasoning for Efficient Adaptive Search Agent designs agents that decide when to rely on internal knowledge or external retrieval, improving both speed and factuality → read the paper

Optimizing Training, Prompting, and Scaling

  • 🌟 System Prompt Optimization with Meta-Learning learns system prompts that generalize well across tasks by jointly optimizing with user prompts → read the paper

  • 🌟 Parallel Scaling Law for Language Models introduces a new scaling paradigm by parallelizing computation instead of increasing model size or tokens → read the paper

  • 🌟 Insights into DeepSeek-V3 explores how hardware-model co-design helps scale massive LLMs efficiently, with new attention and precision techniques → read the paper

  • Learning Dynamics in Continual Pre-Training for Large Language Models analyzes and models loss curves in continual pre-training to predict and control training behavior → read the paper

  • Memorization-Compression Cycles Improve Generalization proposes alternating phases of memorization and compression to improve generalization during training → read the paper

Improving Generative and Visual Models

  • 🌟 QuXAI: Explainers for Hybrid Quantum Machine Learning Models provides an explainability framework for hybrid quantum-classical ML models using feature attributions → read the paper

  • DanceGRPO: Unleashing GRPO on Visual Generation (Do read our deep dive into GRPO) adapts a reinforcement learning algorithm for diverse image and video generation tasks across paradigms → read the paper

  • Unified Continuous Generative Model unifies diffusion, flow-matching, and consistency models into a single framework for more efficient generative training and sampling → read the paper

  • Depth Anything with Any Prior combines sparse metric priors and dense relative predictions to build highly generalizable monocular depth models → read the paper

  • MetaUAS: Universal Anomaly Segmentation with One-Prompt Meta-Learning uses pure vision models and a novel one-prompt method for universal anomaly segmentation without language supervision → read the paper

Better Data Selection and Model Evaluation

  • 🌟 AttentionInfluence selects reasoning-rich training data by analyzing the influence of attention heads, no supervision needed → read the paper

  • 🌟 MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering introduces a Kaggle-based gym for agent training in real-world ML workflows, supporting reinforcement and supervised learning → read the paper

That’s all for today. Thank you for reading! Please send this newsletter to your colleagues if it can help them enhance their understanding of AI and stay ahead of the curve.

Leave a review!
