llama.cpp
speculative : PoC for speeding-up inference via speculative sampling
#2926
Merged

ggerganov merged 3 commits into master from speculative
ggerganov
JohannesGaessler
JohannesGaessler
ggerganov
JohannesGaessler
JohannesGaessler
ggerganov
zhisbug
charliexchen
goliaro
kalomaze
KerfuffleV2
JohannesGaessler
KerfuffleV2
am-randombit
ggerganov force pushed from 22f7a9dd to fdc53e2c 2 years ago
ggerganov changed the base branch from master to build-metal-default 2 years ago
ggerganov changed the base branch from build-metal-default to master 2 years ago
ggerganov force pushed from fdc53e2c to c33cd8ad 2 years ago
ggerganov speculative : initial example
c82c808d
ggerganov speculative : print encoding speed
a15ca746
ggerganov force pushed from 5c2aad7f to a15ca746 2 years ago
ggerganov speculative : add --draft CLI arg
847896ab
ggerganov
ggerganov merged 47068e51 into master 2 years ago
YangWang92
JianbangZ
sorasoras
ggerganov
cermeng


Reviewers
No reviews
Assignees
No one assigned