This Week in Turing Post:
Wednesday / AI 101 series: Neuro-symbolic AI
Friday / AI Literacy: co-create with AI
Our news digest is always free. Upgrade to receive our deep dives in full, directly into your inbox.
Apologies for the delay: the whole internet was not super functional yesterday.
My path into machine learning started with the book The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World, written by Pedro Domingos in 2015. If you read Pedro Domingos on Twitter, you might hate him: he's famously provocative and doesn't care much about hurting feelings.
Recently, I went to the IA Summit organized by Madrona, one of the most forward-thinking VC funds focused on AI. And there he was, Pedro Domingos himself, challenging speakers with sharp questions. "He might tell me off right on the spot," I thought, but curiosity overpowered any assumptions, and I asked if I could sit next to him. "You know," I said, "my ML journey started with your book." It might have melted his heart (I'm not sure), but we had a great conversation about reasoning machines and reinforcement learning, and of course I asked if he was finally working on the Master Algorithm I'd been waiting for since reading his book.
"I'm actually very close to publishing it," he said.
And so, last week, the world was quietly introduced to Tensor Logic, the paper Domingos believes is the closest realization of the Master Algorithm yet. It slipped under the radar, and that's exactly why I want to draw attention to it.
While the title sounds abstract, Tensor Logic is a serious attempt to do what Domingos has promised for a decade: to find a common language for all of AI. His argument is simple and radical: neural networks, symbolic logic, and probabilistic reasoning are not different fields at all. They are the same operation written in different notations. Logical rules, he shows, can be expressed as Einstein summations over tensors. In this view, everything from transformers to Prolog programs to Bayesian networks can be built from a single primitive: the tensor equation.
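To make the "rules as Einstein summations" idea concrete, here is a minimal sketch in NumPy rather than in Tensor Logic itself. The toy family, the relation names, and the use of a simple threshold as the step function are all assumptions for illustration, not code from the paper.

```python
import numpy as np

# Hypothetical toy domain: four people, indexed 0..3.
people = ["Ann", "Bob", "Cat", "Dan"]

# parent[x, y] = 1 means "x is a parent of y".
parent = np.zeros((4, 4), dtype=int)
parent[0, 1] = 1  # Ann is a parent of Bob
parent[1, 2] = 1  # Bob is a parent of Cat
parent[2, 3] = 1  # Cat is a parent of Dan

# Rule: Grandparent(x, z) <- Parent(x, y), Parent(y, z)
# The join on the shared variable y becomes a sum over index y.
counts = np.einsum("xy,yz->xz", parent, parent)

# A step function turns "number of proofs" back into a Boolean truth value.
grandparent = (counts > 0).astype(int)

pairs = [(people[x], people[z]) for x, z in zip(*np.nonzero(grandparent))]
print(pairs)  # [('Ann', 'Cat'), ('Bob', 'Dan')]
```

The same `einsum` call with real-valued tensors is an ordinary matrix product, which is the sense in which a logical join and a neural operation share one notation.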
If that sounds theoretical, it's not. Domingos is proposing a new programming language for AI (and promises to open a repo soon), one where both learning and reasoning live in the same algebra and run directly on GPUs. In Tensor Logic, a neural layer, a logical rule, and a probabilistic inference step all compile to the same structure. No Python scaffolding, no glue code between symbolic and neural components, just equations. It's mathematically elegant and potentially a foundation for the next generation of AI infrastructure.
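As a rough illustration of "the same structure," the sketch below routes a neural layer and a logical rule through one helper. It uses NumPy, not the Tensor Logic language; `tensor_equation` is a hypothetical name, and the weights and graph are made up.

```python
import numpy as np

def tensor_equation(spec, *tensors, nonlinearity=lambda t: t):
    """Hypothetical helper: an einsum join followed by an elementwise nonlinearity."""
    return nonlinearity(np.einsum(spec, *tensors))

def relu(t):
    return np.maximum(t, 0.0)

def step(t):
    return (t > 0).astype(int)

# A dense neural layer: hidden[i] = relu(sum_j W[i, j] * x[j])
W = np.array([[1.0, -1.0], [0.5, 0.5], [-2.0, 1.0]])  # made-up weights
x = np.array([2.0, 1.0])
hidden = tensor_equation("ij,j->i", W, x, nonlinearity=relu)

# A logical rule: TwoHop(x, z) <- Edge(x, y), Edge(y, z)
edge = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])  # made-up graph 0 -> 1 -> 2
two_hop = tensor_equation("xy,yz->xz", edge, edge, nonlinearity=step)

print(hidden.tolist())   # [1.0, 1.5, 0.0]
print(two_hop.tolist())  # [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
```

Only the index pattern and the nonlinearity differ between the two calls, which is the design point the paragraph above describes.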

Image credit: Slides on Tensor-logic.org
The most intriguing part is what he calls reasoning in embedding space. It matters right now because it hits the nerve of where the field is stuck. In Tensor Logic, facts and rules live inside vector embeddings. At low "temperature," the system behaves like pure logic: provable, deterministic, and free of hallucinations. As the temperature rises, reasoning becomes analogical: similar concepts borrow inferences from each other. This "logic-to-analogy" continuum could bridge the gap between the reliability of symbolic reasoning and the pattern recognition of LLMs.
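A toy sketch can show what a temperature dial over embeddings might look like. Everything here is an assumption made for illustration: the 2-D embeddings, the single stored fact, and especially the exponential similarity kernel, which is a stand-in and not necessarily the formulation used in the paper.

```python
import numpy as np

# Hypothetical 2-D unit embeddings; "sparrow" and "robin" are deliberately close.
emb = {
    "sparrow": np.array([1.0, 0.0]),
    "robin": np.array([0.8, 0.6]),
    "rock": np.array([0.0, 1.0]),
}

known_fliers = ["sparrow"]  # the only stored fact: CanFly(sparrow)

def can_fly(name, temperature):
    """Soft truth value of CanFly(name), between 0 and 1."""
    # Similarity to each stored fact is a dot product of unit vectors.
    sims = np.array([emb[name] @ emb[k] for k in known_fliers])
    # Illustrative kernel: an exact match (similarity 1) scores 1 at any
    # temperature; near matches only count as the temperature rises.
    return float(np.max(np.exp((sims - 1.0) / temperature)))

for t in (0.01, 0.5):
    print(t, {name: round(can_fly(name, t), 3) for name in emb})
```

At the low temperature only the stored fact about the sparrow comes out true. At the higher one the robin inherits most of that belief because its embedding is close, while the rock inherits much less.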
That's why this paper might become a big deal. Today's LLMs are fluent imitators but clumsy reasoners. They produce text without a formal system of truth. Tensor Logic offers a way to give them one: a mathematical substrate for reasoning, not just stochastic pattern completion. Can it become to AI what calculus was to physics? We have yet to see. But it is certainly worth exploring.
Want to scale your AV data workflows 10x faster?
Most AV teams can segment LiDAR and camera data, but struggle to iterate quickly, detect rare events, and scale workflows reliably. The teams that succeed combine curation, annotation, and model evaluation all in one place.
Encord is the universal data layer trusted by the world's leading ADAS & AV teams, like Woven by Toyota and Zipline.
We recommend joining Encord's LiDAR experts on Oct. 28 for a masterclass on how to:
Visualize and curate multimodal data, including LiDAR and radar
Automate 3D segmentation of obstacles with single-shot labeling and object tracking
Create robust, scalable pipelines for model training and evaluation
Join live or sign up below to catch the replay.
Topic 2: With so much out there, attention gets stretched too thin. This time, we are focusing on the overlooked topics in the conversation between Andrej Karpathy and Dwarkesh Patel. AGI Bubble and AI Aristocracy. Watch it here →
Links from the editorial:
We are also reading/watching:
Talk with Jensen Huang on AI & the Next Frontier of Growth (video) where he says: "We went from 95% market share to 0%. I can't imagine any policymaker thinking that's a good idea... whatever policy we implemented caused America to lose one of the largest markets in the world to 0%"
"How China Built a Parallel AI Chip Universe in 18 Months" by AI Supremacy
Ringing Black Hole Confirms Einstein and Hawking's Predictions by Simons Foundation
Curated Collections: Learning is power
News from The Usual Suspects ©
Hugging Face launches its Omni Chat. The initial version of HF Chat was somewhat disappointing: when I tried it, it didn't work as expected. Since then, they've completely overhauled the design and functionality, turning it into a powerful router that intelligently selects the most suitable open-source model for each prompt. The new iteration looks great and performs impressively.
Anthropic has introduced Agent Skills, a modular way to enhance Claude's capabilities for specific tasks like Excel work or adhering to brand guidelines. Skills are lightweight, portable folders containing code, resources, and instructions that Claude loads only when needed. Developers and teams can now create and manage custom skills across Claude apps, API, and Claude Code, bringing more structure and precision to AI workflows. Related blog by Simon Willison: Claude Skills are awesome, maybe a bigger deal than MCP
Amazing tutorial: Robot Learning: A Tutorial
Models to pay attention to
DeepSeek-OCR: Contexts Optical Compression
Researchers from DeepSeek AI released DeepSeek-OCR, a vision-text model designed to integrate document understanding into LLMs efficiently. It introduces Contexts Optical Compression, supporting native resolutions from 512×512 to 1280×1280 and a dynamic "Gundam" mode. Prompts enable markdown conversion, layout parsing, OCR, and visual grounding. Running at ~2500 tokens/sec on A100-40G GPUs, it supports both vLLM and Transformers. The model emphasizes efficient visual-token compression and supports flash attention 2.0 for acceleration. It is open-sourced under MIT license and optimized for OCR-rich tasks across diverse visual layouts → GitHub
Fantastic (small) retrievers and how to train them: Mxbai-edge-colbert-v0 tech report
Researchers from Mixedbread AI and Waseda University introduced mxbai-edge-colbert-v0, two late-interaction ColBERT models with 17M and 32M parameters. These outperform ColBERTv2 on BEIR despite lower embedding dimensions (48/64). Using ModernBERT backbones, multi-stage training (contrastive pre-training, fine-tuning, distillation), and optimized ablations, the 17M model supports 32k contexts, runs efficiently on CPU, and stores vectors with 2.5× less memory. It achieves 0.6405 NDCG@10 on NanoBEIR → read the paper
Qwen3Guard technical report
Researchers from Qwen introduced Qwen3Guard, a multilingual safety moderation model available in 0.6B, 4B, and 8B sizes, supporting 119 languages. It includes Generative Qwen3Guard for tri-class safety classification (safe, controversial, unsafe) and Stream Qwen3Guard for token-level real-time moderation. Qwen3Guard-Gen achieves state-of-the-art F1 scores on 8 of 14 English benchmarks, surpasses larger models on multilingual tasks, and supports response refusal detection. Stream Qwen3Guard achieves near real-time latency with only ~2-point performance drop, enabling efficient streaming safety interventions → read the paper
A2FM: An adaptive agent foundation model for tool-aware hybrid reasoning
Researchers from OPPO developed A2FM, a 32B model integrating three execution modes: agentic (tool-using), reasoning (chain-of-thought), and instant (direct answers). It uses a route-then-align strategy and introduces Adaptive Policy Optimization (APO) for efficiency-accuracy trade-offs. A2FM achieves 13.4% on BrowseComp, 70.4% on AIME25, and 16.7% on HLE. It surpasses 32B peers in cost efficiency ($0.00487 per correct answer), cutting cost by 45.2% vs reasoning mode. It ranks top across agentic, reasoning, and general benchmarks → read the paper
The freshest research papers, categorized for your convenience
We organize research papers by goal-oriented or functional categories to make it easier to explore related developments and compare approaches. As always, papers we particularly recommend are marked with 🌟
Reinforcement Learning for Reasoning & Agents
🌟 🌟 🌟 The Art of Scaling Reinforcement Learning Compute for LLMs (by Meta et al.) - model compute-performance curves, isolate which choices shift asymptotes vs. efficiency, and propose a predictable, scalable RL recipe. This paper marks the turning point where reinforcement learning becomes an engineering science, transforming opaque, trial-and-error reward tuning into a predictable, scalable process that can guide the next generation of reasoning and alignment in large models → read the paper

QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs - combine low-precision NVFP4 + LoRA with adaptive quantization noise to speed rollouts, raise exploration entropy, and match full-tune reasoning accuracy at far lower compute → read the paper
Agentic Entropy-Balanced Policy Optimization - balance entropy during rollouts and updates with pre-monitoring and entropy-aware advantages to stabilize long-horizon tool use and improve pass rates on web-agent tasks → read the paper
Attention Illuminates LLM Reasoning: The Preplan-and-Anchor Rhythm Enables Fine-Grained Policy Optimization - identify attention patterns that mark critical tokens and assign targeted RL credit to them for consistent reasoning gains → read the paper
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding - replace costly verifier passes by aligning a last-token self-reward with reasoning rewards, improving RLVR training and test-time scaling at minimal extra inference cost → read the paper
Information Gain-based Policy Optimization - compute dense, intrinsic turn-level rewards from belief updates to overcome reward sparsity in multi-turn agents and improve sample efficiency → read the paper
Demystifying Reinforcement Learning in Agentic Reasoning - distill practical recipes across data, algorithms, and reasoning modes that let small models rival larger ones on agentic benchmarks → read the paper
Stronger Together: On-Policy Reinforcement Learning for Collaborative LLMs - adapt grouped RL to multi-agent roles and turns, scaling cooperative planning, coding, and math accuracy → read the paper
Architectures, Efficiency & Compression
Diffusion Transformers with Representation Autoencoders - replace VAEs with pretrained representation encoders plus trained decoders to yield richer latents, faster convergence, and state-of-the-art DiT image generation → read the paper
Attention Is All You Need for KV Cache in Diffusion LLMs - refresh KV caches adaptively by attention-aware drift tests and depth-aware schedules to accelerate diffusion decoding without quality loss → read the paper
Dr.LLM: Dynamic Layer Routing in LLMs - learn per-layer routers (skip/execute/repeat) from MCTS-discovered paths to cut compute while improving or preserving accuracy across tasks → read the paper
🌟 BitNet Distillation (by Microsoft) - distill full-precision LLMs into ternary-weight task models with SubLN, attention distillation, and warm-start pretraining to deliver large memory savings and faster CPU inference → read the paper
Retrieval & Knowledge Access
🌟 RAG-Anything: All-in-One RAG Framework (by Hong Kong University) - unify multimodal documents via dual-graph structure and cross-modal hybrid retrieval to reason over long, heterogeneous evidence → read the paper
LLM-guided Hierarchical Retrieval - impose a semantic tree over large corpora and traverse with calibrated relevance to achieve logarithmic-complexity, zero-shot retrieval on reasoning-heavy datasets → read the paper
Multimodal & Representation Learning
Scaling Language-Centric Omnimodal Representation Learning - leverage generative pretraining's latent cross-modal alignment and refine with contrastive learning, revealing a generation-representation scaling law → read the paper
🌟 OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM (by NVIDIA) - align audio-vision embeddings and curate omni-modal conversations to outperform larger omni models with far fewer tokens → read the paper
Theory, Evaluation & Practice
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning - analyze sampling-based test-time scaling and propose a hybrid that improves confidence reliability while halving sampling cost → read the paper
🌟 The Role of Computing Resources in Publishing Foundation Model Research (by UT Austin, UCLA, Google) - quantify how compute access correlates with citations and advocate shared infrastructure to broaden participation → read the paper
Applications & Systems
AutoPR: Let's Automate Your Academic Promotion! - transform papers into platform-optimized promotional content with a multi-agent pipeline, dramatically boosting real-world engagement → read the paper
Philosophy & Framing
Language Models Model Language - reframe evaluation with an empiricist lens where usage frequency governs language, offering guidance for designing and interpreting LLMs → read the paper
That's all for today. Thank you for reading! Please send this newsletter to colleagues if it can help them enhance their understanding of AI and stay ahead of the curve.




