llama.cpp
47068e51
- speculative : PoC for speeding-up inference via speculative sampling (#2926)
Commit
2 years ago
speculative : PoC for speeding-up inference via speculative sampling (#2926)

* speculative : initial example
* speculative : print encoding speed
* speculative : add --draft CLI arg
References
#2926 - speculative : PoC for speeding-up inference via speculative sampling
Author
ggerganov
Parents
8f429fa5