
Concepts and Methods you HAVE to Know About → AI 101 Recap

Refresh or learn the most important AI/ML concepts and techniques of the second half of 2025

It was a long year, packed with breakthroughs that pushed AI toward better reasoning and efficiency. Throughout this half of the year, we kept an eye on what matters most in our AI 101 series. Now we have a more complete picture of reinforcement learning (RL), model optimization techniques, continual learning, neuro-symbolic AI, multimodality, robotics, the hardware powering AI today, and other approaches that define the space.

While concepts show the general focus of AI researchers, methods illustrate a more pointed approach to solving AI issues and improving what already works. So let’s see what we have at the end of this year – a solid foundation to build on in 2026.

AI Concepts That Defined 2025

But first, the gifts 🎁
Once a year, in this magical season of giving and receiving, we offer the only chance all year to get our Premium subscription with 20% OFF. Get it before Jan 1 →

P.S. We’re working on offering more. Plan prices will increase in early 2026.

Below are recaps of the key models and concepts we covered in the second half of 2025.

1. Reinforcement Learning: The Ultimate Guide to Past, Present, and Future

It is one of our most-read articles. Why? Reinforcement learning (RL) – the idea of agents learning through trial and error – was shaping real-world systems long before it became the backbone of modern AI, where it now unlocks reasoning and agentic behavior and pushes models toward the category of reasoning models and AI agents. RL is everywhere in the conversation right now, so we put together a clear guide to what it is, where it all started, and where it’s headed.
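To make the trial-and-error idea concrete, here is a minimal tabular Q-learning sketch on a toy chain environment. The environment and all hyperparameters are purely illustrative, not from the article:

```python
# Minimal tabular Q-learning sketch: an agent learns by trial and error.
# The toy "chain" environment and all hyperparameters are illustrative.
import random

N_STATES, ACTIONS = 5, [0, 1]          # move left (0) or right (1) along a chain
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.1, 0.9, 0.2      # learning rate, discount, exploration rate

def step(state, action):
    """Reward of 1 only for reaching the rightmost state."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

for episode in range(500):
    s, done = 0, False
    while not done:
        # epsilon-greedy: mostly exploit the best known action, sometimes explore
        a = random.choice(ACTIONS) if random.random() < eps else max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next, r, done = step(s, a)
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])  # temporal-difference update
        s = s_next
```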

2. The State of RL in 2025

Here’s a closer look at how RL has evolved throughout 2025 since the release of DeepSeek-R1, and at the key trends driving RL today, including Reinforcement Learning with Verifiable Rewards (RLVR), the main policy optimization techniques, RL from human and AI feedback, RL scaling, and more. We also discuss why some argue that RL is not as effective as you might think.
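To give a feel for what a “verifiable” reward means in RLVR, here is a hedged sketch where the reward comes from a programmatic check of the model’s final answer instead of a learned reward model. The answer-extraction logic is illustrative, not from the article:

```python
# Hedged sketch of a "verifiable reward" in the spirit of RLVR: the reward is a
# programmatic check of the rollout's final answer, not a learned reward model.
import re

def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Return 1.0 if the last number in the completion matches the reference answer."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0
    return 1.0 if numbers[-1] == ground_truth else 0.0

# Example: only rollouts whose final answer is exactly right get rewarded.
print(verifiable_reward("... so the total is 42", "42"))   # 1.0
print(verifiable_reward("... the answer is 41", "42"))     # 0.0
```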

3. RLHF variations: DPO, RRHF, RLAIF

Reinforcement learning from human feedback (RLHF) remained the default alignment strategy for LLMs in 2025. It nudges AI toward being more helpful and more consistent.

However, no single method works as a one-size-fits-all solution for alignment. If you want to know more about strong alternatives to RLHF, such as Direct Preference Optimization (DPO), Reward-Rank Hindsight Fine-Tuning (RRHF), and RL from AI Feedback (RLAIF), this episode is for you.
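As a taste, here is a minimal sketch of the DPO objective in PyTorch. It assumes you already have the summed log-probabilities of the chosen and rejected responses under the policy and under a frozen reference model; tensor names and beta are illustrative:

```python
# Minimal sketch of the DPO loss, assuming per-sequence summed log-probs are precomputed.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # Implicit reward of each response: beta * (log pi_policy - log pi_reference)
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # Logistic loss that pushes the chosen response's implicit reward above the rejected one's
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```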

Join Premium members from top companies for only $56/year. Offer expires on Jan 1! Don’t miss out. Learn the basics and go deeper👆🏼

4. What is Continual Learning?

The next stage after training models efficiently is teaching them to keep learning new things over time without forgetting what they already know. Neural networks are not very flexible here: training on new data tends to overwrite old knowledge (catastrophic forgetting). In this episode, we explore how to achieve stable continual learning and explain the approaches of two new, interesting methods (a minimal illustration of the problem follows the list):

  • Google’s Nested Learning idea with the new HOPE architecture for continual learning,

  • Sparse Memory Finetuning (using memory layers) by FAIR at Meta.
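To make the problem concrete, here is a generic sketch of the simplest mitigation, experience replay. This is not HOPE or Sparse Memory Finetuning, just a baseline that mixes a buffer of old examples into new-task batches:

```python
# Generic continual-learning baseline (not the article's methods): experience replay.
# Mixing stored examples from earlier tasks into each new batch reduces forgetting.
import random
import torch

replay_buffer = []          # (input, label) pairs seen on previous tasks
REPLAY_FRACTION = 0.5       # illustrative: half of every batch comes from old data

def make_batch(new_examples, batch_size=32):
    n_replay = int(batch_size * REPLAY_FRACTION) if replay_buffer else 0
    old = random.sample(replay_buffer, min(n_replay, len(replay_buffer)))
    new = random.sample(new_examples, batch_size - len(old))
    replay_buffer.extend(new)                      # remember today's data for tomorrow
    xs, ys = zip(*(old + new))
    return torch.stack(xs), torch.stack(ys)
```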

5. What's New in Test-Time Scaling?

Last December, we made a bold bet that 2025 would be the "Year of Inference-Time Search." Looking back, that prediction defined the entire year. Many systems shifted the focus from the training stage to inference, allowing models to think slowly and more thoroughly. Test-time compute is the key to influencing a model’s behavior during inference, and scaling it can unlock even more from models (the simplest version of the idea is sketched after the list below). Here we dive deep into three outstanding approaches to test-time compute scaling:

  • Chain-of-Layers (CoLa), allowing for better control and optimization of reasoning models.

  • MindJourney test-time scaling framework blending Vision-Language Models (VLMs) and world models.

  • Google Cloud’s TTD-DR applying the diffusion process for test-time scaling to build a better deep research agent.
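As promised above, here is the simplest form of test-time compute scaling: best-of-N sampling with a verifier. It is a generic illustration, not CoLa, MindJourney, or TTD-DR; `generate` and `score` stand in for any model sampler and any verifier or reward model:

```python
# Best-of-N sampling: the simplest way to spend more compute at inference time.
def best_of_n(prompt, generate, score, n=8):
    """Sample n candidate answers and keep the one the verifier scores highest."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)

# Usage sketch: a larger n means more test-time compute and, typically, a better
# chance that at least one candidate passes the verifier.
# answer = best_of_n("Prove that ...", generate=llm_sample, score=verifier_score, n=16)
```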

6. What is Neuro-Symbolic AI?

Neuro-symbolic AI (or neural-symbolic AI) is a concept that appeared long ago, evolved in waves, and is now seen by many as a strong path toward next-level AI. It combines neural networks, which learn patterns from data, with symbolic systems, which handle structured knowledge and logic, to build models that can both predict like neural nets and reason and understand like humans. So it definitely shouldn't be overlooked.
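A toy sketch of the split: a neural network handles perception, and a symbolic rule handles exact reasoning on top of it. The untrained stand-in classifier and names below are purely illustrative:

```python
# Toy neuro-symbolic sketch: neural perception (digit recognition) + symbolic reasoning (addition).
import torch
import torch.nn as nn

perception = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # untrained stand-in classifier

def neural_digit(image: torch.Tensor) -> int:
    """Neural component: map an image to a discrete symbol (a digit)."""
    return int(perception(image.unsqueeze(0)).argmax())

def symbolic_sum(img_a: torch.Tensor, img_b: torch.Tensor) -> int:
    """Symbolic component: exact arithmetic over the symbols the network produced."""
    return neural_digit(img_a) + neural_digit(img_b)
```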

7. The Future of Compute: Intelligence Processing Unit (IPU) and other alternatives to GPU/TPU/CPU

Another one of our most-read and must-read articles. It’s a deep dive into what really powers AI today. We look beyond GPUs’ monopoly to explore CPUs, TPUs, ASICs, APUs, NPUs, and other hardware – what they are and where they are used. Read to learn everything you need to know about them.

8. Inside Robotics

How do you build the body of AI? Here is a comprehensive guide to the basics of robotics – how robots are trained and powered – from Figure 03, NEO, and Unitree robots to NVIDIA’s freshest updates. A very fun episode.

Outstanding AI Methods and Techniques in 2025

1. What matters for RL? Precision! Switching BF16 → FP16

This method drew a lot of attention. For better stability and accuracy in RL, switch the numerical precision from the newer BF16 format back to the older FP16. And here is why →
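For intuition, here is a hedged sketch of what the switch looks like with PyTorch mixed precision. FP16 has more mantissa bits (higher precision) but a narrower exponent range than BF16, so it is usually paired with loss scaling; the model, data, and loss below are placeholders, not the article’s training setup:

```python
# Sketch of the BF16 -> FP16 switch using PyTorch AMP (placeholder model and loss).
import torch

model = torch.nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()                     # needed for FP16's small dynamic range

def train_step(x, target):
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast("cuda", dtype=torch.float16):    # was torch.bfloat16
        loss = torch.nn.functional.mse_loss(model(x), target)
    scaler.scale(loss).backward()                        # scale the loss to avoid FP16 underflow
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```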

2. What are Modular Manifolds?

Thinking Machines Lab contributed a lot of great research to the community, and one piece is a fresh look at neural network optimization with modular manifolds. It steps into the area of geometry-aware optimization for better stability and consistency. We explain the key concepts (weights, gradients, norms, modular norms, manifolds, modular manifolds, and modular duality) to understand how modular manifolds can breathe new life into current optimizers.
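To illustrate the general flavor of manifold-constrained optimization (this is not Thinking Machines’ actual algorithm), here is a toy sketch where each weight matrix is projected back onto a simple constraint set – rows of unit norm – after every optimizer step:

```python
# Toy manifold-constrained training loop: optimize freely, then retract weights
# back onto a constraint set. The "manifold" here (unit-norm rows) is illustrative.
import torch

def project_rows_to_unit_norm(module: torch.nn.Module):
    """Retraction step: renormalize each weight row so it stays on the unit sphere."""
    with torch.no_grad():
        for p in module.parameters():
            if p.dim() == 2:
                p.div_(p.norm(dim=1, keepdim=True).clamp_min(1e-8))

model = torch.nn.Linear(64, 64)
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

def step(x, y):
    opt.zero_grad()
    torch.nn.functional.mse_loss(model(x), y).backward()
    opt.step()
    project_rows_to_unit_norm(model)   # pull the weights back onto the constraint set
```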

3. What is XQuant?

Another important part of model optimization is optimizing memory use – often a much bigger bottleneck than the math itself. XQuant and its XQuant-CL variation are very interesting methods that can reduce memory use by up to 12 times while adding only a little extra compute, sidestepping the standard KV cache.
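Here is a hedged sketch of the core idea: cache a low-bit quantized copy of the layer input X instead of K and V, and recompute K = XW_k and V = XW_v when attention needs them. The per-token int8 quantizer and shapes are illustrative, not the paper’s exact scheme:

```python
# Sketch of the XQuant idea: store quantized X, recompute K and V on demand.
import torch

def quantize(x):                                   # simple per-token int8 quantization
    scale = x.abs().amax(dim=-1, keepdim=True).clamp_min(1e-8) / 127.0
    return (x / scale).round().to(torch.int8), scale

def dequantize(q, scale):
    return q.to(torch.float32) * scale

class XCache:
    def __init__(self, w_k, w_v):
        self.w_k, self.w_v = w_k, w_v
        self.q_chunks, self.scales = [], []

    def append(self, x_new):                       # store X in low precision (small memory footprint)
        q, s = quantize(x_new)
        self.q_chunks.append(q)
        self.scales.append(s)

    def keys_values(self):                         # pay a little extra compute at attention time
        x = torch.cat([dequantize(q, s) for q, s in zip(self.q_chunks, self.scales)], dim=0)
        return x @ self.w_k, x @ self.w_v
```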

4. Fusing Modalities: Basics + the New MoS Approach

AI models now work across images, text, audio, video, and more, but combining even two modalities is still a challenge. As AI systems race to become true all-rounders, multimodal fusion has become central to building models that understand how different data types work together. In this article, we explore how multimodal data is mixed, the challenges involved, and common fusion strategies, with a closer look at Meta AI and KAUST’s Mixture of States (MoS) approach, which mixes data at the state vector level within each layer using a learnable router.
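As a generic illustration (not the exact MoS implementation), here is a tiny fusion module that projects two modalities into a shared space and uses a learnable gate – a miniature “router” – to decide how much each contributes; dimensions are illustrative:

```python
# Generic multimodal fusion sketch: project each modality, then mix with a learned gate.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, d_img=512, d_txt=768, d_model=256):
        super().__init__()
        self.proj_img = nn.Linear(d_img, d_model)
        self.proj_txt = nn.Linear(d_txt, d_model)
        self.router = nn.Linear(2 * d_model, 2)    # learns per-example mixing weights

    def forward(self, img_emb, txt_emb):
        i, t = self.proj_img(img_emb), self.proj_txt(txt_emb)
        weights = torch.softmax(self.router(torch.cat([i, t], dim=-1)), dim=-1)
        return weights[..., :1] * i + weights[..., 1:] * t   # fused representation

fused = GatedFusion()(torch.randn(4, 512), torch.randn(4, 768))  # -> shape (4, 256)
```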

5. What is Mixture-of-Recursions (MoR)?

In 2025, developers also kept working on the Transformer itself. Mixture-of-Recursions (MoR) is a Transformer variant that reuses layers dynamically, so each token receives only the amount of computation it needs. Instead of always running through a fixed number of layers, MoR controls how deeply each token “thinks” using routing and KV caching.
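A toy sketch of the recursion-with-routing idea, omitting MoR’s KV-caching details and exact routing scheme (layer sizes are illustrative):

```python
# Toy Mixture-of-Recursions: one shared block reused several times, with a per-token
# router choosing each token's depth. A real MoR also skips compute for finished tokens.
import torch
import torch.nn as nn

class ToyMoR(nn.Module):
    def __init__(self, d_model=256, max_recursions=4):
        super().__init__()
        self.shared_block = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.router = nn.Linear(d_model, max_recursions)   # predicts each token's depth
        self.max_recursions = max_recursions

    def forward(self, x):                                  # x: (batch, seq, d_model)
        depth = self.router(x).argmax(dim=-1) + 1          # 1..max_recursions per token
        for r in range(1, self.max_recursions + 1):
            # Only tokens routed to at least r recursions keep this step's update
            # (here the block still runs on all tokens; real MoR avoids that compute too).
            active = (depth >= r).unsqueeze(-1)
            x = torch.where(active, self.shared_block(x), x)
        return x

out = ToyMoR()(torch.randn(2, 16, 256))
```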

6. Rethinking Causal Attention

2025 was also a year of reconsidering attention mechanisms. Models need access to rich global context, and CASTLE (Causal Attention with Lookahead Keys) modifies causal attention to provide it: models can incorporate limited information from future tokens while still generating text autoregressively. Really worth exploring →
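For reference, here is the vanilla causal-attention baseline that CASTLE relaxes: the mask forbids any token from attending to later positions. This is the standard mechanism, not CASTLE’s lookahead keys:

```python
# Standard causal (masked) self-attention: each token can only see itself and earlier tokens.
import torch
import torch.nn.functional as F

def causal_attention(q, k, v):                     # shapes: (batch, seq, d)
    seq = q.size(1)
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    mask = torch.triu(torch.ones(seq, seq, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))   # block attention to future tokens
    return F.softmax(scores, dim=-1) @ v

out = causal_attention(torch.randn(1, 8, 64), torch.randn(1, 8, 64), torch.randn(1, 8, 64))
```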

If you’re not done yet and want to refresh what was relevant in the first half of 2025 and what has changed, check out our two summer recaps as well.

Next week, we’ll recap the models we covered this year.

Stay tuned – and upgrade to receive all our deep dives directly in your inbox.
