vllm
Commit 80fcc3ed
- [Kernel] Pipe attn_logits_soft_cap through paged attention TPU kernels (#12482)
Commit
166 days ago
[Kernel] Pipe attn_logits_soft_cap through paged attention TPU kernels (#12482)
Signed-off-by: Fenghui Zhang <fhzhang@google.com>
References
#12482 - [Kernel] Pipe attn_logits_soft_cap through paged attention TPU kernels
Author
fenghuizhang
Parents
c386c43c
Files (1)
vllm/attention/backends/pallas.py
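The commit threads an `attn_logits_soft_cap` parameter through to the TPU paged-attention kernels. As context, logits soft-capping conventionally means squashing attention scores with a scaled tanh, `cap * tanh(logits / cap)`, so their magnitude never exceeds the cap. The sketch below only illustrates that math with a hypothetical helper function; it is not the actual Pallas kernel code from `pallas.py`.

```python
import numpy as np
from typing import Optional


def soft_cap_logits(logits: np.ndarray,
                    attn_logits_soft_cap: Optional[float]) -> np.ndarray:
    """Illustrative tanh soft-capping of attention logits.

    With a cap C, each logit x is mapped to C * tanh(x / C), which is
    approximately x for |x| << C and saturates toward +/-C for large |x|.
    A cap of None disables capping (the usual convention for an optional
    knob like this).
    """
    if attn_logits_soft_cap is None:
        return logits  # capping disabled: pass logits through unchanged
    return attn_logits_soft_cap * np.tanh(logits / attn_logits_soft_cap)
```

For example, with a cap of 30.0, a huge raw score such as 1e6 is squashed to just under 30, while small scores near zero are left essentially unchanged.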