llama.cpp
llama-server : implement universal assisted decoding
#12635
Merged
Commits (27)
llama-server : implement universal assisted decoding (g2mt committed 1 year ago)
Merge branch 'master' into master (g2mt committed 1 year ago)
Merge remote-tracking branch 'fork/master' into universal-decoding (g2mt committed 330 days ago)
Erase prompt tail for kv-cache (g2mt committed 330 days ago)
set vocab_dft_compatible in common_speculative (g2mt committed 330 days ago)
rename ctx_main to ctx_tgt (g2mt committed 330 days ago)
move vocab_dft_compatible to spec struct (g2mt committed 330 days ago)
clear mem_dft, remove mem (g2mt committed 330 days ago)
detokenize id_last for incompatible models (g2mt committed 330 days ago)
update comment (g2mt committed 330 days ago)
add --spec-replace flag (g2mt committed 330 days ago)
accept special tokens when translating between draft/main models (g2mt committed 330 days ago)
Merge remote-tracking branch 'upstream/master' (g2mt committed 330 days ago)
Escape spec-replace (g2mt committed 330 days ago)
Merge branch 'ggml-org:master' into master (g2mt committed 326 days ago)
clamp draft result to size to params.n_draft (g2mt committed 326 days ago)
Merge branch 'ggml-org:master' into master (g2mt committed 320 days ago)
Merge branch 'ggml-org:master' into master (g2mt committed 314 days ago)
Merge branch 'ggml-org:master' into master (g2mt committed 300 days ago)
fix comment (g2mt committed 300 days ago)
clean up code (g2mt committed 299 days ago)
restore old example (g2mt committed 299 days ago)
log common_speculative_are_compatible in speculative example (g2mt committed 299 days ago)
fix (g2mt committed 299 days ago)
Update common/speculative.cpp (g2mt committed 298 days ago)
Update common/speculative.cpp (g2mt committed 298 days ago)
Update common/speculative.cpp (g2mt committed 298 days ago)