Joined Turing Post in April 2024. Studied aircraft control systems at BMSTU (Moscow, Russia), conducting several research projects on helicopter models. Now more focused on AI and writing.
We explore how combining LightThinker and Multi-Head Latent Attention cuts memory use and boosts performance.
This is one of the hottest topics thanks to DeepSeek. Learn with us: the core idea, its types, scaling laws, real-world cases, and useful resources to dive deeper.
We explore the power of datasets and their integration in Hugging Face's small language model family, particularly SmolLM2.
We discuss how to enable the Mamba Selective State Space Model (SSM) to handle multimodal data using the Mixture-of-Transformers concept and modality-aware sparsity.
We explore Google's and Microsoft's advancements that implement "chain" approaches for long-context and multi-hop reasoning.
We dive into test-time compute and discuss five-plus open-source methods for scaling it effectively to deepen models' step-by-step reasoning.
World models are the next big thing enabling Physical AI. Let's explore how NVIDIA makes it happen.
We explore in detail three RAG methods that address the limitations of the original RAG and align with the coming year's trends.
Explore how RL can be blended with natural language.