server : refactor slot input data, move tokenizer to HTTP thread #10023
server : refactor slot input data, move tokenizer to HTTP thread
125835b2
move prompt_tokens.empty() check
5c749bea
Merge branch 'master' into xsn/refactor_server_slot_input
3abc3396
fix incorrect if branch
60d4194b
fix infinite generation loop
b550011b
bring back infill validation
cff97ad3
add infill test
fea5ca45
try fixing format_infill
07381f7d
fix test
c34ab08a
remove redundant code
575b1332
rename completion to inference
4a9f3e76
update docs
13ee7793
ngxson
marked this pull request as ready for review 1 year ago
use llama_tokens everywhere
7f7acdbe
ggerganov
approved these changes
on 2024-10-24
ngxson
merged
958367bf
into master 1 year ago
ngxson
commented
on 2024-10-31
Assignees
No one assigned
Labels
examples
python
server