llama.cpp
[CANN]: add the basic supports of Flash Attention kernel
#13627
Merged
hipudding merged 18 commits into ggml-org:master from shibizhao:flash-attn-cann
cann: add the basic FA support
72df31df
cann: update the readme
3a731825
cann: update the FlashAttention with PSEShift
6a39d638
cann: update the input parameters in FA
8a902b98
cann: update the alibi with max_bias
f5e24a5c
cann: add the constraints of softcap
c8c2908b
cann: update the docs CANN.md
47f2c646
cann: update the docs CANN.md
fb62f015
github-actions added the documentation label
github-actions added the ggml label
shibizhao changed the title from "cann: add the basic supports of Flash Attention kernel" to "[CANN]: add the basic supports of Flash Attention kernel" 218 days ago
hipudding requested a review from hipudding 218 days ago
hipudding added the Ascend NPU label
noemotiovon commented on 2025-05-21
cann: fix typo of CANN.md
b266beb2
cann: add some comments and update the CANN.md
8a112f0a
cann: update the CANN.md
1779e008
noemotiovon commented on 2025-05-21
cann: update the inner precise for fusedInferAttention
092ccf68
noemotiovon commented on 2025-05-22
cann: update the constraints of flash_attn_ext on ggml-cann.cpp
c380305b
cann: resolve the conflict with latest master branch
89f884e6
Merge branch 'master' into flash-attn-cann
1a3bfecb
cann: clean the whitespace
3b084d5b
cann: clean the whitespace
d23697b8
cann: add a new endline
8a7829b7
hipudding approved these changes on 2025-05-26
hipudding merged 2d38b6e4 into master 212 days ago
shibizhao deleted the flash-attn-cann branch 212 days ago
Reviewers
hipudding
noemotiovon
Assignees
No one assigned
Labels
documentation
ggml
Ascend NPU
Milestone
No milestone