vllm
80fcc3ed - [Kernel] Pipe attn_logits_soft_cap through paged attention TPU kernels (#12482)

Commit
166 days ago
[Kernel] Pipe attn_logits_soft_cap through paged attention TPU kernels (#12482)

Signed-off-by: Fenghui Zhang <fhzhang@google.com>
  • vllm/attention/backends/pallas.py
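
For context on what the `attn_logits_soft_cap` value in the commit title typically controls: the diff itself is not shown here, so the following is only a minimal pure-Python sketch of the common tanh-based soft-capping scheme, under the assumption that the kernels apply it to attention logits before the softmax. The helper names `soft_cap_logits` and `softmax` are hypothetical and are not taken from `pallas.py`.

```python
import math
from typing import List, Optional, Sequence


def soft_cap_logits(logits: Sequence[float],
                    soft_cap: Optional[float]) -> List[float]:
    """Hypothetical helper: squash each logit into (-soft_cap, soft_cap).

    Assumes the common formulation soft_cap * tanh(logit / soft_cap).
    A soft_cap of None leaves the logits unchanged.
    """
    if soft_cap is None:
        return list(logits)
    return [soft_cap * math.tanh(x / soft_cap) for x in logits]


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a flat list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Small logits pass through almost unchanged; large ones saturate near the cap.
raw = [0.5, 10.0, 200.0]
capped = soft_cap_logits(raw, soft_cap=50.0)
weights = softmax(capped)
```

In this formulation the cap bounds the magnitude of every logit while preserving their order, which is why it would need to reach the kernel that computes the attention scores rather than be applied afterwards.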