Categories
2 pages
Machine Learning
Attention Mechanisms - tracking the evolution + pair programming in PyTorch
Speculative Decoding: 2x to 4x speedup of LLMs without quality loss