llama.cpp
33f890e5
Commit
49 days ago
vulkan: support flash attention GQA/split_k with small batches (#18938)
References
#18938 - vulkan: support flash attention GQA/split_k with small batches
Author
jeffbolznv
Parents
067b8d7a
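The commit title bundles two ideas: GQA (grouped-query attention, where several query heads share one key/value head) and split_k, which partitions the key/value sequence across workgroups so that small batches still expose enough parallelism to fill the GPU. The actual change lives in llama.cpp's Vulkan shaders; the C++ below is only a minimal host-side sketch of the generic split_k combine step for flash attention, with all names (SplitPartial, combine_splits) invented for illustration.

```cpp
// Conceptual sketch, NOT the llama.cpp Vulkan shader code: each split
// processes one slice of the KV sequence and emits a partial result
// (running max m, exp-sum l, exp-weighted accumulator acc) for a query
// row; a final pass merges the partials in a numerically stable way.

#include <algorithm>
#include <cmath>
#include <vector>

// Hypothetical per-split partial output for one query row.
struct SplitPartial {
    float m;                 // max attention logit seen in this split
    float l;                 // sum of exp(logit - m) over this split
    std::vector<float> acc;  // exp-weighted sum of V rows (length = head dim)
};

// Merge the partials from all splits into the final attention output.
std::vector<float> combine_splits(const std::vector<SplitPartial>& parts) {
    const size_t d = parts[0].acc.size();

    // Global max over all splits; each partial is rescaled to it.
    float m_max = parts[0].m;
    for (const auto& p : parts) m_max = std::max(m_max, p.m);

    float l_total = 0.0f;
    std::vector<float> out(d, 0.0f);
    for (const auto& p : parts) {
        const float scale = std::exp(p.m - m_max);  // rebase onto global max
        l_total += p.l * scale;
        for (size_t i = 0; i < d; ++i) out[i] += p.acc[i] * scale;
    }

    // Normalize by the combined softmax denominator.
    for (size_t i = 0; i < d; ++i) out[i] /= l_total;
    return out;
}
```

The rescale by exp(m - m_max) is what makes the merge safe: each split tracked its own local maximum, and rebasing every partial exp-sum and accumulator onto the global maximum before summing reproduces exactly the single-pass softmax result without overflow.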