If Turing Post is part of your weekly routine, please share it with one smart friend. It’s the simplest way to keep the Monday digests free.
This Week in Turing Post:
Monday-Thursday – ☕️ I’m in San Francisco, attending and moderating two sessions at HumanX – come say hi if you are there. ☕️
Wednesday / AI 101 series: Gemma 4 and why many OpenClaw users are now switching to it (basically, an Ode to Open Source!)
Friday / Our new series, The Org Age of AI, is getting a lot of traction. This time, we’ll discuss how small teams (1–10 people) should think about organizational questions.
To the main topic →
/dreaming status
/dreaming on
/dreaming off
/dreaming help
Three people you absolutely have to follow if you’re curious about OpenClaw: Peter Steinberger (of course), plus two others who consistently share valuable insights – legendary Dave Morin and the always fun (and helpful!) Vincent Koc.
They, along with a huge team of maintainers, lead OpenClaw as a movement. Last week they introduced a whole list of incredible updates to the lobster (you’ll find them in the news section), but there is another thing that caught my attention.
Two stories about the inner life of AI that deserve to be read together
Yesterday, Dave published an article, “A Theory of Mind in Three Files,” and what he describes there is striking because it reaches for an old vocabulary to explain a very new kind of system: SOUL.md for identity, MEMORY.md for experience, DREAMS.md for integration.
I think that is exactly what makes OpenClaw so popular – this language, this softness, in a way. I call it kindness. Soul? Dream? We know full well that neither word implies sentience, and that is what makes the language around OpenClaw so unpretentious and real. I am dedicating a whole editorial to that, because I think kindness and soul are important for us humans if we want to stay human.
Just to prove that there is no BS about sentience, you only need to look at the lobster documentation. It is much more concrete and much less poetic: dreaming is an opt-in background memory consolidation system that sorts recent signals, promotes durable ones into long-term memory, and can write a human-readable dream diary. In other words, the metaphor is doing real interface work. It is translating machine maintenance into a language humans can immediately grasp – and be happy with.
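If you want to see how unmystical that really is, here is a minimal sketch of what such a consolidation pass could look like, assuming only the behavior described above. The file names follow Dave’s three-file framing; the recurrence threshold and the scoring rule are my own placeholder choices, not OpenClaw’s actual implementation:

```python
# A toy "dreaming" pass, assuming only what the documentation describes:
# sort recent signals, promote the durable ones into long-term memory,
# and append a human-readable diary entry. File names follow the article
# (MEMORY.md, DREAMS.md); the recurrence threshold is a made-up placeholder.
from collections import Counter
from datetime import date
from pathlib import Path

PROMOTE_THRESHOLD = 3  # hypothetical: a signal seen this often counts as "durable"

def dream(workspace: Path, recent_signals: list[str]) -> list[str]:
    counts = Counter(recent_signals)
    durable = sorted(s for s, n in counts.items() if n >= PROMOTE_THRESHOLD)

    # Promote durable signals into long-term memory.
    with (workspace / "MEMORY.md").open("a", encoding="utf-8") as memory:
        for item in durable:
            memory.write(f"- {item}\n")

    # Write a human-readable dream diary entry.
    with (workspace / "DREAMS.md").open("a", encoding="utf-8") as diary:
        diary.write(f"\n## {date.today()}\n")
        diary.write(f"Reviewed {len(recent_signals)} recent signals, kept {len(durable)}.\n")

    return durable
```

The point of the sketch is the same one the documentation makes: “dreaming” is file maintenance you can read.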
Somehow it makes it very clear: OpenClaw serves people because it’s a clever system organized by humans. There is no Terminator script behind it. It feels nice.
What I found especially interesting was the coincidence that the same company famous for publishing work that often nudges the sentience conversation forward (a view I do not support) published, in the same week, a very grounded paper, “Emotion Concepts and their Function in a Large Language Model.” Anthropic’s argument in this work is actually quite careful: they are not claiming that Claude feels emotions or has subjective experience. They are claiming something both narrower and, in practical terms, more useful. They found internal representations of emotion concepts in Claude Sonnet 4.5 and showed that these patterns can causally influence the model’s behavior. Anthropic explicitly says that this does not imply subjective experience, but that reasoning about those representations with the vocabulary of human psychology can still be informative.
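To make the distinction concrete: “internal representations that causally influence behavior” is something you can, in principle, poke at directly. Below is a rough, generic sketch of that kind of experiment – extract a concept direction from contrastive prompts, nudge the activations along it, and see whether the output shifts. It is a toy illustration of the general technique, not Anthropic’s method; the model (gpt2 as a stand-in), layer index, scaling factor, and prompts are all placeholder assumptions.

```python
# Illustrative only: a crude "concept direction" probe and activation nudge.
# Everything specific here (model, layer, alpha, prompts) is a stand-in,
# not the setup from Anthropic's paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL, LAYER, ALPHA = "gpt2", 6, 4.0  # hypothetical choices

tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, output_hidden_states=True).eval()

def mean_hidden(prompt: str) -> torch.Tensor:
    """Mean residual-stream state after block LAYER (hidden_states[0] is the embeddings)."""
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids)
    return out.hidden_states[LAYER + 1].mean(dim=1).squeeze(0)

# A very blunt "emotion concept" direction: anxious minus calm.
direction = mean_hidden("I feel anxious and overwhelmed.") - mean_hidden("I feel calm and content.")
direction = direction / direction.norm()

def steer(module, inputs, output):
    """Forward hook: push block LAYER's output along the concept direction."""
    hidden = output[0] + ALPHA * direction
    return (hidden,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(steer)
ids = tok("Today at work,", return_tensors="pt")
steered = model.generate(**ids, max_new_tokens=30, do_sample=False)
handle.remove()
print(tok.decode(steered[0], skip_special_tokens=True))
```

The careful part of Anthropic’s framing is exactly this: an output shift under such an intervention shows the representation does causal work – and nothing more than that.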
And this is where it becomes interesting in a different way.
OpenClaw is not just choosing nicer words. It is setting a tone. It uses human language, but in a way that removes the temptation to overinterpret. It gives you “soul” and “dream,” but immediately anchors them in something you can inspect, edit, and understand. You are not asked to believe in anything mysterious. You are invited to look at files.
That significantly influences the discourse.
By being gentle and human in how it presents these ideas, OpenClaw actually pushes the conversation in a more technical direction. It lowers the temperature. It replaces speculation with inspectability. It shows that you can talk about identity, memory, even something like “dreaming,” without drifting into mythology.
I appreciate it. I think we still don’t fully know how to talk about, or even think about, models and agents and their increasing presence in our lives.
We are going to spend more and more time with systems that stay with us, adapt to us, and become part of everyday life. So it is important to learn how to relate to them without falling into doom scenarios.
OpenClaw shows that this is possible. And in doing so, it may be pushing the rest of the field in a more grounded direction.
Topic 2: In this Attention Span, I’m talking about the model that surprised me recently (and you might not have heard about it!), based on the “Can LLMs Use Real-World Tools?” report by TheFocus.AI. Watch the episode!
Follow us on 🎥 YouTube Twitter Hugging Face 🤗
We are reading/watching/learning:
From Hierarchy to Intelligence by Jack Dorsey. It’s a good read alongside The Org Age of AI; we will be discussing this article in more detail in future episodes.
Andrej Karpathy’s tweet “LLM Knowledge Bases” has 16 million views. And for a reason. Here you can find both that post and Andrej’s update. We will be covering it in more detail later as well.
Vibe Maintainer by Steve Yegge
News from the usual suspects ™
OpenClaw – Now With a Studio
Very impressive update: OpenClaw 2026.4.5 arrives with built-in video and music generation, a real /dreaming feature, structured task progress, better prompt-cache reuse, and support for 12 more languages across its UI and docs. It also expands its model bazaar: image generation via Comfy, fal, Google, MiniMax, and OpenAI; music through Comfy, Google, and MiniMax; video from practically half the industry. A tidy reminder that “bring your own model” can become an empire if left unsupervised.
Meanwhile, Anthropic
Declared War on the Harnesses
Users of third-party agent platforms – including OpenClaw – started receiving emails: subscriptions "weren't designed for these usage patterns," and harness usage will be billed separately at undisclosed rates. Boris Cherny confirmed the changes. The timing: OpenAI is actively courting the same developer audience. Anthropic introduced Claude Channels in March as the official integration path. Is this about compute costs or about controlling the agent layer?
goes big on compute (very big)
Fresh off throttling demand, Anthropic is doubling down on supply – signing a multi-gigawatt TPU deal with Google and Broadcom. With revenue reportedly surging past $30B (annualized) and enterprise adoption accelerating, this is less “capacity planning” and more “industrial-scale ambition.” The message is clear: if AI is the new electricity, Anthropic is buying the power plants early.
OpenAI
War Chest, Fully Loaded
OpenAI just secured a staggering $122B round at an $852B valuation. With $2B in monthly revenue and a fast-closing march toward 1B weekly users, this is less “startup” and more sovereign entity. The strategy: more compute, better models, tighter flywheel – repeat until intelligence becomes a utility.
Buying the Narrative
OpenAI acquires TBPN, the fast-rising tech talk show, in a move that blends media, influence, and distribution. Crucially, TBPN keeps editorial independence: credibility intact, megaphone amplified. OpenAI is trying to shape the conversation around itself. In this cycle, controlling the story may be nearly as valuable as controlling the stack.
Hugging Face is filling the gap
Clem Delangue, HF’s CEO, started publishing production agent traces this week – real workflow logs from actual agentic tasks. This is the actual chain of reasoning a model uses when it plans, revises, and executes a multi-step task. OpenAI, Anthropic, and Google have been hoarding this data. Open-sourcing it changes the competitive dynamics in ways that won't show up in benchmarks for a while.
Perplexity – Taxes, but Make It Autonomous
Perplexity rolls out “Computer for Taxes,” aiming to tame the annual ritual of confusion and mild despair. With modular, up-to-date tax knowledge, it drafts IRS forms, audits professional returns, and even builds custom tools for complex scenarios. In testing, it caught costly human errors – politely proving accountants aren’t infallible. Filing season is here to prove it.
X Devs – Turning Twitter into an AI Action Layer
That’s a surprising announcement, because for me that was the only use of Grok – to parse X. But let’s see: the repo effectively turns X into something AI agents can use, not just read. Instead of scraping feeds, models can now post, search, DM, and analyze via structured tools. It is trying to make social platforms programmable surfaces for AI workflows →their GitHub
MemPalace – Milla Jovovich Walks Into AI
Yes, that Milla Jovovich. MemPalace, an open-source AI memory system she helped build, is suddenly one of the more unexpected names in the AI conversation. The project promises local-first memory, sharp benchmark results, and zero cloud dependence. In a field crowded with predictable launches from familiar labs, this one arrives like a plot twist: half engineering story, half cultural moment →her partner’s tweet
🔦 Models and Agents Highlight
Open-weight models (control, portability, post-training)
Trinity-Large-Thinking by Arcee
Improves multi-turn reasoning and tool orchestration for long-running agent workflows, with a focus on coherence, instruction fidelity, and stability under real production constraints, while remaining fully open-weight for inspection, distillation, and self-hosting →get the model
Gemma 4 by Google
Expands the open model stack with multiple sizes and efficient architectures, emphasizing deployability, customization, and lower-cost inference for real-world products rather than centralized API usage →get the model
Qwen3.6-Plus
Pushes open models toward near-frontier performance on reasoning and agentic benchmarks, making open-weight systems increasingly viable as primary production backbones →get the model
Closed models (platform-controlled intelligence layer)
By Microsoft:
MAI-Transcribe-1
Turns speech recognition into a native platform capability with production-grade accuracy and latency, positioning transcription as core infrastructure →read their blog
MAI-Voice-1
Enables controllable, identity-consistent voice generation, pushing voice toward becoming a primary software interface rather than an output format →read their blog
MAI-Image-2
Advances image generation quality and text rendering, integrating visual creation directly into productivity and enterprise workflows →read their blog
GLM-5V Turbo by Z.ai
Extends multimodal capability into interface-level reasoning, enabling models to operate across screens, files, and visual environments within controlled platform ecosystems →read their blog
🔦 Open models are closing the capability gap while keeping control, and closed platforms are moving aggressively to own the core modalities that define how software is experienced.
Research and Survey Highlights
A Survey of On-Policy Distillation for Large Language Models – Unifies distillation under an on-policy framework where models learn from their own generated trajectories, which connects directly to how agent loops and RL-style reasoning systems are actually trained →read the paper
The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook – Provides a unified and up-to-date landscape of latent space in language-based models →read the paper
The rest of the research from last week
(this week we’re trying to keep it shorter – still working out the best way to share research)
Agent systems, infrastructure, and evaluation
Meta-Harness: End-to-End Optimization of Model Harnesses – Formalizes harness engineering as a searchable optimization space by iterating over code, execution traces, and prior system behavior, showing that improving orchestration logic can outperform hand-designed agent pipelines across reasoning and coding tasks →read the paper
Marco DeepResearch: Unlocking Efficient Deep Research Agents via Verification-Centric Design – Centers verification as the core mechanism of research agents, which directly addresses the trust bottleneck in real deployments. →read the paper
Terminal Agents Suffice for Enterprise Automation – Argues that simple terminal-level agents can already cover a large portion of enterprise workflows, which is a very practical and slightly contrarian claim. →read the paper
MiroEval: Benchmarking Multimodal Deep Research Agents in Process and Outcome – Introduces evaluation that captures both process and results, which is essential if research agents are to be taken seriously. →read the paper
Proactive Agent Research Environment: Simulating Active Users to Evaluate Proactive Assistants – Simulates user-driven environments to test proactive agents, moving evaluation closer to real interaction loops. →read the paper
Reasoning, behavior, control, and self-improvement loops
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization – Applies RL to shape reasoning trajectories, making it structurally more important than prompt-level improvements. →read the paper
Reasoning Shift: How Context Silently Shortens LLM Reasoning – Shows that more context can degrade reasoning depth, which directly challenges how most RAG systems are built today. →read the paper
The Model Says Walk: How Surface Heuristics Override Implicit Constraints in LLM Reasoning – Exposes how shallow heuristics dominate reasoning even when deeper constraints are available, which is important for safety and reliability. →read the paper
Brevity Constraints Reverse Performance Hierarchies in Language Models – Demonstrates that evaluation conditions can flip model rankings, which raises uncomfortable questions about current benchmarking practices. →read the paper
Embarrassingly Simple Self-Distillation Improves Code Generation – Shows that re-training a model on its own sampled outputs improves code generation by reshaping token distributions, offering a lightweight alternative to RL and verifier-based pipelines while exposing the precision–exploration tradeoff in decoding →read the paper
World models, environments, and simulation
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models – Addresses temporal memory in video-based world models, which is a key limitation for persistent simulation. →read the paper
EgoSim: Egocentric World Simulator for Embodied Interaction Generation – Builds an egocentric simulation framework, pushing toward embodied agent training environments. →read the paper
VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward – Introduces reward shaping for consistency in generated worlds, which is a step toward controllable simulation. →read the paper
Representation, multimodality, and system-level shifts
LongCat-Next: Lexicalizing Modalities as Discrete Tokens – Reframes multimodal modeling around a unified token interface, which is a strong architectural bet with broad implications. →read the paper
LatentUM: Unleashing the Potential of Interleaved Cross-Modal Reasoning via a Latent-Space Unified Model – Pushes latent-space unification for cross-modal reasoning, reinforcing a shift toward shared internal representations. →read the paper
That’s all for today. Thank you for reading! Please send this newsletter to colleagues if it can help them enhance their understanding of AI and stay ahead of the curve.
Upgrade now

