vllm
2f42a488 - [Feature] Support KV cache offloading and disagg prefill with LMCache connector. (#12953)

Commit

169 days ago

Author

YaoJiayi

examples/offline_inference
- cpu_offload_lmcache.py
- disaggregated_prefill_lmcache.py
vllm/distributed
- kv_transfer/kv_connector
  - factory.py
  - lmcache_connector.py
- parallel_state.py