vllm
01a583fe
- [Kernel] Decouple Tile Size from Block Size in Triton Unified Attention Kernel (#21197)
Commit
81 days ago
[Kernel] Decouple Tile Size from Block Size in Triton Unified Attention Kernel (#21197)

Signed-off-by: Jan van Lunteren <jvl@zurich.ibm.com>
References
#21197 - [Kernel] Enable Hybrid Model Support in Triton Unified Attention Kernel
Author
jvlunteren
Parents
bc19d759