server : refactoring (wip)
f4e6e7e6
server : remove llava/clip objects from build
ef7eb339
server : fix empty prompt handling + all slots idle logic
134f5fec
server : normalize id vars
ad1d746c
server : code style
fef64c58
server : simplify model chat template validation
b1b3ba88
server : code style
f4800d54
server : minor
7635b13a
llama : llama_chat_apply_template support null buf
f84809b7
server : do not process embedding requests when disabled
22ae1a62
server : reorganize structs and enums + naming fixes
cb3ce0bf
server : merge oai.hpp in utils.hpp
4a2d5f63
server : refactor system prompt update at start
61b63705
server : disable cached prompts with self-extend
aef02b11
server : do not process more than n_batch tokens per iter
bfb121fd
server: tests: embeddings use a real embeddings model (#5908)
79ef3c05
server, tests : bump batch to fit 1 embedding prompt
36e12f8f
server: tests: embeddings fix build type Debug is randomly failing (#…
59850f18
phymbert
approved these changes
on 2024-03-06
server: tests: embeddings, no need to wait for server idle as it can …
3166ccf5
server: refactor: clean up http code (#5912)
c50a5100
ggerganov
marked this pull request as ready for review 1 year ago
server : avoid n_available var
c53d84ec
server: refactor: better http codes
9c8d3c8a
ngxson
commented
on 2024-03-06
ngxson
commented
on 2024-03-06
ngxson
commented
on 2024-03-06
server : simplify json parsing + add comment about t_last
fd74b5ea
server : rename server structs
234ab58a
server : allow to override FQDN in tests
818d898f
server : add comments
87a4a105
ggerganov
merged
2002bc96
into master 1 year ago
ggerganov
deleted the gg/refactor-server branch 1 year ago