Attention Operator (CPU) #25156
Skeleton for Attention Operator (CPU)
39ff7993
Update onnxruntime/core/providers/cpu/llm/attention.cc
ec9348ad
Merge branch 'main' of https://github.com/microsoft/onnxruntime into …
7ea6cbc1
first draft for attention
2174ab61
First working attention scenario
d03321ca
fix build issues
a6d803b3
fix with mask
72d8213b
add mask
ad946b7b
new addition
deaa572a
add test on causal
00fa987c
improve kernel
929bd731
fix 3D
a8d19b99
Merge branch 'main' of https://github.com/microsoft/onnxruntime into …
da457b95
add softcap
d6c6c46c
fix present_value
8c2a53e0
fix gqa
bdbb773b
Fix gda past, present
88a04a34
fix qkmode
3d07fecf
add more unit test
099ba9a7
add ort_enforce
d45fd010
more enforce
08b76dac
xadupre
changed the title from "[DRAFT] Attention Operator (CPU)" to "Attention Operator (CPU)" 187 days ago
xadupre
marked this pull request as ready for review 187 days ago
Update onnxruntime/core/providers/cpu/llm/attention.cc
f3ccc4f8
Update onnxruntime/core/providers/cpu/llm/attention.cc
7567629d
address PR comments
bb53f0fb
improve for 2D mask
6457762c
comment
e76c56b1
refactor
ac719af3
useless variable
6f45bb57
Merge branch 'main' of https://github.com/microsoft/onnxruntime into …
66109dbb
fix parameters
a58f41e6
Update onnxruntime/core/providers/cpu/llm/attention_helper.h
3d9d5431
Merge branch 'main' of https://github.com/microsoft/onnxruntime into …
47c32044
Update onnxruntime/core/providers/cpu/llm/attention_helper.h
e1e9ec57
Merge branch 'xadupre/attention' of https://github.com/microsoft/onnx…
c1b4e120
add support for float16
2dfb5d4d
enable gemm for float16 in Attention
34262fb6
support for float16
e31b42d0
cast
eb6eef07
pr comments
bcc7c8a4
fix input transposition
5fd5ac89
fix gemm fp16
5c89fe2b
fix issues
14786ad7
fix an error message
1a2f5fc2
enables removed test
876c8eff
fix fp16 implementation
6b393ba1
Merge branch 'main' of https://github.com/microsoft/onnxruntime into …
94466597
fix missing cast
5b826d30
disable one warning
1035cf8c
disable attention 3d tests
1456623a
update test cases
351c3036
remove unnecessary comment
2b936168
Merge branch 'main' of https://github.com/microsoft/onnxruntime into …
3bfc049b
disable two tests
c7eb814c
Merge branch 'main' of https://github.com/microsoft/onnxruntime into …
9cf5712e
Update onnxruntime/core/providers/cpu/llm/attention.cc
45f5556b
xadupre
dismissed their stale review via 45f5556b 173 days ago
Merge branch 'main' of https://github.com/microsoft/onnxruntime into …
753dbbaa
tianleiwu
approved these changes
on 2025-07-25
xadupre
merged commit c3499d78 into main 170 days ago
xadupre
deleted the xadupre/attention branch 170 days ago
snnn
removed release:1.23.0