Synaptic Radio

I'm Anshu, a machine learning engineer with nine years of experience turning first-principles ideas into prototypes and production systems. Here, I share ideas, practical insights and best practices to help fellow engineers accelerate their ML projects.

Machine Learning, NLP, LLM

Attention Mechanisms - tracking the evolution + pair programming in PyTorch

A comprehensive exploration of attention mechanisms in transformers and how they enable models to selectively focus on relevant information.

May 16, 2025
17 minute read
Machine Learning, NLP

Speculative Decoding: 2x to 4x speedup of LLMs without quality loss

Understand how speculative decoding achieves 2-4x faster LLM inference without compromising output quality. A smaller draft model proposes several tokens, and the main model verifies them in parallel, which eases the memory-bandwidth bottleneck of decoding one token at a time.

May 12, 2025
11 minute read
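
To make the draft-then-verify idea in the summary above concrete, here is a minimal sketch of the greedy variant in plain Python. Everything in it is an assumption made for illustration: the `Model` type, `speculative_step`, `generate`, `greedy`, and the two toy models are hypothetical names, not code from the post. Real models return probability distributions, and the sampling variant uses a probabilistic acceptance rule that is not shown here.

```python
from typing import Callable, List

# Hypothetical stand-in for a language model: a function that maps a token
# sequence to the single next token it would pick greedily.
Model = Callable[[List[int]], int]


def speculative_step(target: Model, draft: Model, tokens: List[int], k: int = 4) -> List[int]:
    """Run one draft-then-verify round and return the extended sequence."""
    # 1. The cheap draft model proposes k tokens autoregressively.
    proposal: List[int] = []
    for _ in range(k):
        proposal.append(draft(tokens + proposal))

    # 2. The target model checks each drafted position. A real implementation
    #    scores all k positions in one batched forward pass; this loop only
    #    reproduces the outcome of that pass.
    accepted: List[int] = []
    for drafted in proposal:
        expected = target(tokens + accepted)
        if drafted != expected:
            # First mismatch: keep the target's own token and end the round.
            accepted.append(expected)
            return tokens + accepted
        accepted.append(drafted)

    # 3. Every draft matched, so the target contributes one extra token.
    accepted.append(target(tokens + accepted))
    return tokens + accepted


def generate(target: Model, draft: Model, prompt: List[int], max_new_tokens: int, k: int = 4) -> List[int]:
    """Repeat speculative rounds until max_new_tokens tokens have been added."""
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new_tokens:
        tokens = speculative_step(target, draft, tokens, k)
    return tokens[: len(prompt) + max_new_tokens]


def greedy(target: Model, prompt: List[int], max_new_tokens: int) -> List[int]:
    """Baseline: ordinary one-token-at-a-time greedy decoding."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(target(tokens))
    return tokens


# Toy models (assumptions): the target counts upward modulo 10, and the draft
# agrees with it except right after token 4, where it guesses wrong.
def toy_target(seq: List[int]) -> int:
    return (seq[-1] + 1) % 10


def toy_draft(seq: List[int]) -> int:
    return 7 if seq[-1] == 4 else toy_target(seq)
```

Because every kept token is exactly what the target would have chosen for that prefix, `generate(toy_target, toy_draft, [0], 12)` yields the same sequence as `greedy(toy_target, [0], 12)`. In this sketch the speedup would come only from how many drafted tokens survive each round; the toy functions do not model actual compute cost.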
© 2025 Anshuman Sahoo