text-generation-inference
db922eb7 - Update to attention-kernels 0.2.0 (#2950)

Committed 1 year ago
Update to attention-kernels 0.2.0 (#2950). This version removes our patches/custom API, which makes it simpler to pick up changes from upstream. One such change is that we can now enable the FP8 KV cache for paged attention as well.