llama.cpp
47068e51 - speculative : PoC for speeding-up inference via speculative sampling (#2926)

Commit
2 years ago
speculative : PoC for speeding-up inference via speculative sampling (#2926)

* speculative : initial example
* speculative : print encoding speed
* speculative : add --draft CLI arg