llama.cpp
b436edaa
- server : take into account speculative limits
Commit
1 year ago
server : take into account speculative limits

ggml-ci
References
#10641 - server : fix speculative decoding with context shift
Author
ggerganov
Committer
ggerganov
Parents
a5a915b5