text-generation-inference
db922eb7
- Update to attention-kernels 0.2.0 (#2950)
Committed: 1 year ago
Update to attention-kernels 0.2.0 (#2950)

This version removes our patches/custom API, which makes it simpler to pull in changes from upstream. One of those changes is that FP8 KV cache can now be enabled for paged attention as well.
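With the upstream kernels in place, an FP8 KV cache can be requested when launching the server. A minimal sketch, assuming this TGI version exposes a `--kv-cache-dtype` launcher flag; the flag value and model id shown are illustrative assumptions, not taken from the commit:

```shell
# Launch text-generation-inference with an FP8 (e4m3) KV cache
# for paged attention. The flag value and model id below are
# illustrative assumptions about this TGI version.
text-generation-launcher \
    --model-id meta-llama/Llama-3.1-8B-Instruct \
    --kv-cache-dtype fp8_e4m3fn \
    --port 8080
```

An FP8 KV cache halves cache memory relative to FP16, allowing longer contexts or more concurrent sequences at some precision cost.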
Author: danieldk
Parents: 40b00275