vllm
4238bc82 - [Core] Cross-attention KV caching and memory-management (towards eventual encoder/decoder model support) (#4837)

Commit

1 year ago

Author

afeldman-nm

Parents