server : refactor slot input data, move tokenizer to HTTP thread #10023
server : refactor slot input data, move tokenizer to HTTP thread
125835b2
move prompt_tokens.empty() check
5c749bea
Merge branch 'master' into xsn/refactor_server_slot_input
3abc3396
fix incorrect if branch
60d4194b
fix infinite generation loop
b550011b
bring back infill validation
cff97ad3
add infill test
fea5ca45
try fixing format_infill
07381f7d
fix test
c34ab08a
remove redundant code
575b1332
rename completion to inference
4a9f3e76
update docs
13ee7793
ngxson
marked this pull request as ready for review 351 days ago
use llama_tokens everywhere
7f7acdbe
ggerganov
approved these changes
on 2024-10-24
ngxson
merged
958367bf
into master 351 days ago
ngxson
commented
on 2024-10-31
Labels
examples
python
server