Google AI Search: When Search Becomes Software

This Week in Turing Post:

Wednesday / AI 101 series: From Tokens to Answers: What Actually Happens During LLM Inference
Friday / We continue our series, The Org Age of AI.

Google AI Search is shifting from a system that retrieves information into a system that can generate interfaces, execute tasks, and build lightweight software in response to user intent. Instead of only returning links, Search is becoming an interactive execution layer built on AI agents and generative UI. Let’s discuss.

To the main topic → I was invited to cover Google I/O. Here is what stood out for me:

Google is changing what the search box is.

When I say it like this, it sounds casual. But I just can’t get over it.

This may be the second great transformation of Internet Search itself. The first one changed how people found information online. You typed a query, Google ranked the web, and the answer came back as links. That simple interface reorganized the internet: publishing, advertising, commerce, knowledge work, even how we think about what it means to “know” something. That’s when “googling” became a word in everyone’s vocabulary.

Now Google is changing the result itself. Search is no longer only a place where you ask and receive information. It is becoming a place where a question can turn into a custom interface, an interactive widget, a specialized layout, a visual explanation, a tracker, a planner, or a small tool built on the fly. The things builders are currently trying to assemble with hundreds of agents might eventually converge into something as simple as a search box in Chrome. It feels important.

How it will work: Google is integrating the agentic coding capabilities of Gemini 3.5 Flash and its Antigravity framework directly into Search. In the demo, Search was able to plan the answer, decide what components were needed, research the topic, write code, and run it in a secure containerized environment. Google called this “agentic coding at the scale of Search” and said generative UI with Antigravity is coming to Search this summer, free of charge.
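To make that plan → research → write code → run loop concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names (`plan_components`, `research`, `write_code`, `run_isolated`) are hypothetical, the "model" steps are hard-coded stubs with made-up data, and a separate Python process with a timeout stands in for the secure containerized environment. It does not reflect how Gemini 3.5 Flash or Antigravity actually work.

```python
import subprocess
import sys


def plan_components(query: str) -> list[str]:
    """Hypothetical planner stub: decide which UI components the answer needs."""
    components = ["summary"]
    if "compare" in query.lower():
        components.append("comparison_table")
    return components


def research(query: str) -> dict[str, float]:
    """Hypothetical research stub: a real system would gather facts from the web."""
    return {"Option A": 4.5, "Option B": 3.8}  # made-up placeholder data


def write_code(components: list[str], facts: dict[str, float]) -> str:
    """Hypothetical codegen stub: emit a small program that renders the components."""
    lines = [f"facts = {facts!r}"]
    if "summary" in components:
        lines.append("best = max(facts, key=facts.get)")
        lines.append("print('Top pick: ' + best)")
    if "comparison_table" in components:
        lines.append("for name, score in facts.items():")
        lines.append("    print(name + ' | ' + str(score))")
    return "\n".join(lines)


def run_isolated(source: str, timeout_s: float = 5.0) -> str:
    """Run generated code in a separate process.

    This is only a stand-in for a container: a subprocess with a timeout
    is NOT a security boundary.
    """
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True, text=True, timeout=timeout_s, check=True,
    )
    return result.stdout


def answer(query: str) -> str:
    """Chain the four steps: plan, research, write code, run it."""
    components = plan_components(query)
    facts = research(query)
    source = write_code(components, facts)
    return run_isolated(source)


if __name__ == "__main__":
    print(answer("Compare Option A and Option B"))
```

The point of the sketch is the shape of the loop: the generated program, not a list of links, is what produces the result, and it runs somewhere separate from the system that wrote it.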

That is mind-blowing, and I am not that easy to impress.

We are used to thinking of Search as a doorway to the web. Google is now trying to make it an execution layer on top of the web. That shift has enormous consequences. For users, it could make complex information easier to understand and act on. For developers, it changes what “building” means, because many small tools may be generated at the moment of need. For publishers and websites, it raises a harder question: what happens when the user interacts mainly with the generated layer while the open web becomes the source material behind it?

I don’t think we fully realize what this means at search scale. Agentic coding inside an IDE is already a big deal. Agentic coding inside Search is something else. Search is one of the most familiar interfaces in the world. If billions of people can use it, if it becomes intuitive and natural, then building becomes a much more ordinary human action.

Of course, many questions remain. Generated interfaces can be wrong. Personal context can become too invasive. Agents need constraints, permissions, and accountability. The web economy may be reshaped again.

It may simply not work well enough. Too many connections, too many surfaces, too much context, too many places where things can break. Google already has an enormous number of products, and bringing them into one coherent agentic layer is not a small technical or organizational problem.

But if Google can make it work, the direction is extremely interesting. Google once changed how we find information. Now it is trying to create a way to build with information.

I still find it hard to fully imagine: Search becoming a place where intent turns into software. But I’m excited to see how it will work. If it does, of course.

Topic #2 is about builders and building companies.

Speaking of builders, I had a terrific conversation with Eric Ries, who popularized the term MVP, minimum viable product, and wrote The Lean Startup, the book that changed how startups build and test ideas. His new book, Incorruptible, asks a different question: how do you build companies that stay coherent, trustworthy, and human, and why should love and trust become part of builders’ language? You should watch the conversation →

If any of those thoughts resonate with you – share them across your social networks. Let’s keep the conversation going.

Twitter Library

13 Open-Source Tools for Foundation Model Deployment

A practical guide to open-source tools for deploying, serving, and running foundation models, from local LLMs to high-throughput production inference.

Turing Post • Ksenia Se

News from the usual suspects ™

Cerebras had a huge Nasdaq debut, showing that the market still wants credible compute stories. Agents are token-hungry, and the economics of inference are becoming central to the whole AI race.
Cursor unveiled Composer 2.5, calling it its “most powerful model yet” – smarter on long-running coding tasks, better at following complex instructions, and allegedly up to 10x more efficient than peers. The company says the model builds on Moonshot’s open-source Kimi K2.5 and uses reinforcement learning across sprawling token rollouts.
Cursor also revealed a partnership with SpaceXAI to train a much larger model on “Colossus 2” infrastructure with a million H100-equivalent GPUs. Silicon Valley’s favorite pastime remains unchanged: casually mentioning compute budgets that rival small nations. Elon Musk said: “Opus 4.7 is still better than Composer 2.5, albeit a lot more expensive. Cursor is however an important piece of the puzzle to make Grok much better.”
xAI moved Grok closer to builders. Grok Build, now in early beta, brings Grok into the terminal as a coding agent with planning, diffs, plugins, hooks, skills, MCP servers, and parallel subagents. This is the same direction everyone is chasing: agents that can work inside real developer environments.
Thinking Machines showcased Interaction Models, a real-time multimodal AI system designed to behave less like a prompt box and more like a collaborator. Instead of the usual turn-based “you speak, I speak” rhythm, the model continuously processes audio, video, and text in 200ms “micro-turns,” enabling interruptions, simultaneous speech, visual awareness, and live tool use. Very impressive.

OpenAI moved further from model provider to deployment company. It created OpenAI Deployment Company, backed by major investment, and is acquiring Tomoro, an AI consulting firm. The message is clear: the bottleneck is no longer model access. It is workflow redesign inside organizations.
Anthropic kept pushing Claude into business infrastructure. Claude for Small Business connects to tools like QuickBooks, PayPal, HubSpot, Canva, DocuSign, Google Workspace, and Microsoft 365. Anthropic also expanded its PwC partnership, with Claude Code and Claude workflows going deeper into professional services. Claude’s positioning is becoming very clear: trusted AI for institutional work.
Meta focused on AI inside messaging and social surfaces. It introduced Incognito Chat for Meta AI in WhatsApp, added more teen supervision tools, and continued pushing Business AI on WhatsApp. Meta’s advantage is distribution. Its challenge is permission, privacy, and trust.
Isomorphic Labs raised $2.1 billion to scale AI drug discovery. This is one of the strongest signals that AI-for-science is moving from impressive research demos toward industrial pipelines. The question is whether AI-native drug design can become a repeatable engine for pharma.
Alibaba tied Qwen more directly to cloud and commerce. AI is becoming a larger share of Alibaba Cloud revenue, while Qwen is also being used as a conversational interface for Taobao and Tmall. The model is not the destination here. It is the interface to shopping, logistics, ads, and cloud consumption.
Mistral sharpened the sovereign AI argument. Arthur Mensch warned that Europe has about two years to avoid dependence on U.S. AI infrastructure. His point was practical: sovereignty means controlling chips, energy, and compute capacity.
Google at a glance. Google announced one big thing across many products: AI is becoming agentic, multimodal, and embedded into every surface.

Theme	Main announcements
Scale	3.2 quadrillion tokens/month across Google services; 900M Gemini app users; AI Mode passed 1B monthly users
Models	Gemini 3.5 Flash launched; 3.5 Pro coming next month; Gemini Omni announced for multimodal generation/editing
Search	New AI Search box; AI Overviews + AI Mode merged; Search agents; generative UI; custom mini-apps, trackers, dashboards
Agents	Gemini Spark for 24/7 background tasks; MCP support coming; Chrome agentic browsing; Android Halo later this year
Coding	Antigravity 2.0; CLI/SDK/voice; subagents, hooks, async tasks; OS built by agents demo
Workspace	Docs Live voice drafting; voice coming to Gmail and Keep; Spark coming to Workspace and Enterprise
Commerce	UCP expanded; AP2 for controlled agent payments; Universal Cart across Search/Gemini, later YouTube/Gmail
Creative tools	Google Pics; Stitch updates; Flow gets Omni, agents, custom tools, music remixing
XR	Android XR glasses; audio glasses this fall with Samsung, Gentle Monster, Warby Parker
Trust & safety	SynthID expansion; Content Credentials in Search/Chrome; CodeMender API
Science	Gemini for Science; AlphaEarth, WeatherNext, Isomorphic Labs updates

🔦 Research Highlight

Code as agent harness

Researchers from University of Illinois Urbana-Champaign, Meta and Stanford University survey how code is becoming the operational harness for AI agents, not just their output. They organize the field into three layers: interfaces for reasoning, action, and environment modeling; mechanisms for planning, memory, tools, feedback, and optimization; and multi-agent scaling through shared artifacts for coordination and verification. Applications include coding assistants, GUI/OS automation, embodied agents, science, personalization, DevOps, and enterprise workflows →read the paper

No one knows the state of the art in geospatial foundation models

Researchers from Taylor Geospatial, Technical University of Munich, Microsoft AI for Good, Allen Institute for AI, Vector Institute, Clark University, University of British Columbia, and Arizona State University audited 152 geospatial foundation model papers (2019–2025) and found severe comparability failures. They identified 46 cases where reported results for identical model-benchmark-protocol combinations differed by ≥10 points, including Scale-MAE scoring 33.0 vs 89.6 on NWPU-RESISC45. Across 401 benchmarks, 35% of papers shared none of the top-10 datasets. Additionally, 39% released no model weights, 75% used unique pretraining setups, and 94 of 126 papers had incomparable pretraining configurations. The study proposes six standards: mandatory licensed weight release, shared benchmark suites, rerun/copied baseline labeling, variance reporting, unified evaluation harnesses, and disentangled architecture-versus-data ablations →read the paper

Models (a lot of great open weights models last week)

Research

Trends we see looking at every paper related to AI and ML published last week:

Agentic systems, memory, and autonomous workflows

🌟 δ-mem: Efficient Online Memory for Large Language Models – treats memory as a compact online state coupled directly with attention, instead of relying on longer context windows or external retrieval →read the paper
🌟 MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning – attempts to optimize multi-agent coordination directly instead of hand-designing collaboration workflows →read the paper
🌟 Look Before You Leap: Autonomous Exploration for LLM Agents – studies exploration strategies for agents operating beyond scripted environments, which becomes critical once agents leave demos and enter open-ended systems →read the paper
MMSkills: Towards Multimodal Skills for General Visual Agents – pushes agents toward reusable multimodal capabilities, a direction that matters more than endlessly scaling single monolithic agents →read the paper

Reasoning, RLVR, and test-time intelligence

Nudging Beyond the Comfort Zone: Efficient Strategy-Guided Exploration for RLVR – improves exploration in RL with verifiable rewards, one of the central bottlenecks in reasoning-focused post-training →read the paper
🌟 Learning from Failures: Correction-Oriented Policy Optimization with Verifiable Rewards – turns failed reasoning trajectories into structured optimization signals instead of discarding them as noise →read the paper

World models, embodied AI, and spatial intelligence

🌟 Actionable World Representation – argues that world models should optimize for actionable representations rather than passive realism, which is a meaningful conceptual shift →read the paper
Learning POMDP World Models from Observations with Language-Model Priors – combines language priors with partially observable world modeling, directly relevant for embodied agents operating under uncertainty →read the paper

Efficiency, inference, and architecture optimization

🌟 CompactAttention: Accelerating Chunked Prefill with Block-Union KV Selection – targets one of the real operational bottlenecks in long-context inference: expensive prefill computation →read the paper
EndPrompt: Efficient Long-Context Extension via Terminal Anchoring – proposes a lightweight mechanism for extending effective context length without brute-force scaling →read the paper
SNLP: Layer-Parallel Inference via Structured Newton Corrections – explores layer-parallel inference, a potentially important departure from strictly sequential decoding →read the paper
GQLA: Group-Query Latent Attention for Hardware-Adaptive Large Language Model Decoding – adapts attention behavior to hardware-aware decoding constraints, increasingly important for inference economics →read the paper

Interpretability, controllability, and internal representations

🌟 Targeted Neuron Modulation via Contrastive Pair Search – investigates precise neuron-level interventions through contrastive search techniques →read the paper
Sparse Autoencoders enable Robust and Interpretable Fine-tuning of CLIP models – combines sparse autoencoders with multimodal fine-tuning to improve interpretability and robustness simultaneously →read the paper
🌟 Forgetting That Sticks: Quantization-Permanent Unlearning via Circuit Attribution – explores whether unlearning can survive quantization and deployment instead of disappearing after compression →read the paper
MixSD: Mixed Contextual Self-Distillation for Knowledge Injection – injects knowledge through contextual self-distillation rather than expensive retraining pipelines →read the paper

That’s all for today. Thank you for reading! Please send this newsletter to colleagues if it can help them enhance their understanding of AI and stay ahead of the curve.

FOD#153: Agentic coding in search – What it even means?
