llama.cpp
speculative : fix n_outputs_max and remove draft-simple auto-enable
#23988

Merged

ggerganov merged 5 commits into master from gg/spec-fix-n-max

speculative : add common_speculative_n_max helper function

8c41b75a

cont : draft context always has n_parallel outputs

a808e890

llama : log n_outputs_max

016191d6

speculative : remove draft-simple auto-enable

6476b674

ggerganov added refactoring

github-actions added examples

github-actions added server

ci : enable server tests on PRs

2f6f998d

github-actions added devops

ggerganov marked this pull request as ready for review 4 days ago

ggerganov requested a review 4 days ago

ggerganov merged 5dcb7116 into master 4 days ago

ggerganov deleted the gg/spec-fix-n-max branch 4 days ago

Labels

refactoring examples devops server
