[CPU] GQA supports head_sink input for smooth softmax #25269
update spec
1d84450c
tianleiwu
marked this pull request as draft 173 days ago
Implement CPU
a1b51b74
merge main
4a9f45c4
add cpu test
e8b1490c
Merge branch 'main' into tlwu/gqa_head_sink
fdb8619f
reduce tests
fc6d1b3f
tianleiwu
changed the title from "[WIP] GQA supports per head smooth softmax" to "[CPU] GQA supports per head smooth softmax" 167 days ago
lintrunner
489dba20
tianleiwu
marked this pull request as ready for review 167 days ago
revert some files
1a1421d0
tianleiwu
changed the title from "[CPU] GQA supports per head smooth softmax" to "[CPU] GQA supports head_sink input for smooth softmax" 167 days ago
revert cuda/rocm
d09d8049
update doc
19e3f55f
update cuda script
60de045c
nenad1002
approved these changes
on 2025-07-09
tianleiwu
merged
cd5f91fe
into main 167 days ago
tianleiwu
deleted the tlwu/gqa_head_sink branch 167 days ago