Hacker News
DSpark: Speculative decoding accelerates LLM inference [pdf]
(github.com)
665 points
by aurenvale
9 hours ago
252 comments