Inference Optimization
Speculative Decoding: 2x to 4x speedup of LLMs without quality loss