[Backend] Introduce vLLM backend #2976
cfd22726 backend(vllm): initial commit
bd2ec03d backend(vllm): statically allocate LLMEngine
02e4b9ab backend(vllm): plug in the tokio server and CLI
a7c2a470 backend(vllm): submit new request to vLLM engine
dc5addae backend(vllm): remove python print stmt
7028f5bc backend(vllm): make v1 the default
32dffcff backend(vllm): expose FFI for CompletionOutput and RequestOutput on R…
003163a2 backend(vllm): map ResultOutput to InferStreamResponse to stream back…
5452c129 backend(vllm): disable metrics for now
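The mapping commit above (RequestOutput → InferStreamResponse) can be sketched roughly as below. The dataclasses here are hypothetical stand-ins that mirror the shape of vLLM's `CompletionOutput`/`RequestOutput`, and `to_infer_stream_response` plus its output dict are assumed shapes for illustration, not the actual types used in this PR.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-in for vLLM's per-sequence output (not the real class).
@dataclass
class CompletionOutput:
    index: int
    text: str
    token_ids: List[int]
    finish_reason: Optional[str] = None  # None while still generating

# Hypothetical stand-in for vLLM's per-request output (not the real class).
@dataclass
class RequestOutput:
    request_id: str
    outputs: List[CompletionOutput]
    finished: bool

def to_infer_stream_response(out: RequestOutput) -> dict:
    """Map one engine output chunk to an assumed stream-response shape."""
    best = out.outputs[0]  # single-sequence case for simplicity
    return {
        "request_id": out.request_id,
        "text": best.text,
        "tokens": best.token_ids,
        "finished": out.finished,
        "finish_reason": best.finish_reason,
    }

# One intermediate chunk of a streaming generation.
chunk = RequestOutput(
    request_id="req-0",
    outputs=[CompletionOutput(index=0, text="Hello", token_ids=[15496])],
    finished=False,
)
resp = to_infer_stream_response(chunk)
print(resp["text"])  # → Hello
```

In the actual backend this conversion would happen on each chunk yielded by the engine's streaming generator, so the server can forward partial text to the client before the request finishes.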