Add GGML_HIP_ROCWMMA_FATTN to enable rocWMMA for FlashAttention #12032
Add GGML_HIP_ROCWMMA_FATTN and rocwmma header check
206d22bd
Add rocWMMA support
02369da4
Merge branch 'master' into pr
547115da
Update ggml/src/ggml-hip/CMakeLists.txt
419f1ea9
Move comments to reduce confusion.
828577a9
Use namespace alias `wmma` instead of lots of ifdefs.
9d27c38b
Fix: FP16_MMA_AVAILABLE should not be checked in host code.
19272bfa
Always return false in `fp16_mma_available` when compiling for HIP an…
29debe14
Remove the Q->ne[1] > 8 check
5d4ab04c
Also always return false in fp16_mma_hardware_available when compiled…
55169095
Revert "Also always return false in fp16_mma_hardware_available when …
fea171f5
ggml: Make fattn use hardware warp size instead of 32
a90f4cb7
ggml: Make fattn kernel use launch bounds w/HIP
a135b4c7
IMbackK requested changes on 2025-03-03
Use GGML_CUDA_CC_IS_CDNA for checking CDNA architectures.
373d48ef
IMbackK approved these changes on 2025-03-03
IMbackK merged becade5d into master 1 year ago
hjc4869 deleted the pr branch 1 year ago