Tags
2 pages
LLM
Attention Mechanisms - tracking the evolution + pair programming in PyTorch
Speculative Decoding: 2x to 4x speedup of LLMs without quality loss