llama.cpp
9a590c82 - CUDA: optimize MMQ int8 tensor core performance (#8062)
Commit
1 year ago
CUDA: optimize MMQ int8 tensor core performance (#8062)

* CUDA: optimize MMQ int8 tensor core performance
* only a single get_mma_tile_x_k function
* simplify code, make functions constexpr
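The commit message points at two changes: collapsing per-quantization-type tile-size lookups into a single get_mma_tile_x_k function, and marking helpers constexpr so tile dimensions are compile-time constants. A minimal sketch of that pattern in CUDA follows; the type names, tile widths, and kernel shape are illustrative assumptions, not the actual llama.cpp implementation.

#include <cstdint>

// Hypothetical subset of quantization types, for illustration only.
enum class mmq_q_type { Q4_0, Q8_0, Q4_K };

// One constexpr helper returning the tile width along k for the int8
// tensor core (mma) path. Values are assumed, not taken from llama.cpp.
static constexpr __host__ __device__ int get_mma_tile_x_k(mmq_q_type type) {
    switch (type) {
        case mmq_q_type::Q4_0: return 64;
        case mmq_q_type::Q8_0: return 32;
        case mmq_q_type::Q4_K: return 64;
    }
    return 0;
}

// Because the helper is constexpr, the result can size shared memory
// and unroll bounds at compile time.
template <mmq_q_type type, int nrows>
__global__ void mul_mat_q_sketch() {
    constexpr int tile_x_k = get_mma_tile_x_k(type);
    __shared__ int8_t tile_x[nrows * tile_x_k];
    // Load quantized tiles into shared memory, then issue int8 mma
    // (tensor core) instructions over tile_x ... (omitted in this sketch).
    (void)tile_x;
}

// Example launch: mul_mat_q_sketch<mmq_q_type::Q8_0, 16><<<grid, block>>>();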
References
#8062 - CUDA: optimize MMQ int8 tensor core performance
Author
JohannesGaessler
Parents
52fc8705