vllm
2326814c
- renaming for consistency
Commit
136 days ago
renaming for consistency

Signed-off-by: Lucas Wilkinson <lwilkinson@neuralmagic.com>
References
#12588 - [WIP] MLA decode attention - cuda graph support
Author
LucasWilkinson
Parents
534cd000
Files (6)
csrc/cache.h
csrc/cache_kernels.cu
csrc/torch_bindings.cpp
vllm/_custom_ops.py
vllm/attention/backends/abstract.py
vllm/attention/backends/mla/utils.py