llama.cpp
73e2ed3c - CUDA: use async data loading for FlashAttention (#11894)

Commit
261 days ago
CUDA: use async data loading for FlashAttention (#11894)

* CUDA: use async data loading for FlashAttention

---------

Co-authored-by: Diego Devesa <slarengh@gmail.com>