text-generation-inference
[Backend] Introduce vLLM backend
#2976
Open

mfuntowicz wants to merge 9 commits into main from vllm/setup
mfuntowicz, 203 days ago
No description provided.
mfuntowicz backend(vllm): initial commit
cfd22726
mfuntowicz backend(vllm): statically allocate LLMEngine
bd2ec03d
mfuntowicz backend(vllm): plug in the tokio server and CLI
02e4b9ab
mfuntowicz backend(vllm): submit new request to vLLM engine
a7c2a470
mfuntowicz backend(vllm): remove python print stmt
dc5addae
mfuntowicz backend(vllm): make v1 the default
7028f5bc
mfuntowicz backend(vllm): expose FFI for CompletionOutput and RequestOutput on R…
32dffcff
mfuntowicz backend(vllm): map ResultOutput to InferStreamResponse to stream back…
003163a2
mfuntowicz backend(vllm): disable metrics for now
5452c129
mfuntowicz requested a review from Narsil 203 days ago
mfuntowicz requested a review from Hugoch 203 days ago
