openvino
[CPU] Implement X-Attention for intel CPU
#32086
Merged
maxnick merged 33 commits into openvinotoolkit:master from mangguo321:mang/xatt
mangguo321 requested a review 255 days ago
mangguo321 requested a review 255 days ago
github-actions added category: CPU
github-actions added category: build
mangguo321 marked this pull request as draft 255 days ago
zhangYiIntel assigned zhangYiIntel 241 days ago
zhangYiIntel requested changes on 2025-09-30
zhangYiIntel commented on 2025-09-30
liubo-intel force pushed from 95664397 to 7dc5202e 241 days ago
liubo-intel force pushed from 7dc5202e to 51840cb6 241 days ago
liubo-intel force pushed from 51840cb6 to 07148e40 232 days ago
mangguo321 marked this pull request as ready for review 228 days ago
github-actions removed category: build
github-actions added category: build
mangguo321 force pushed from 158ae447 to c704613d 226 days ago
zhangYiIntel commented on 2025-10-16
zhangYiIntel approved these changes on 2025-10-16
mangguo321 force pushed from abce4b9b to cefab2c0 225 days ago
mangguo321 force pushed from 675797d2 to 40e7ec43 221 days ago
yuxu42 requested a review from vshampor 220 days ago
yuxu42 requested a review from copilot-pull-request-reviewer 220 days ago
copilot-pull-request-reviewer commented on 2025-10-21
maxnick added this to the 2026.0 milestone 191 days ago
maxnick assigned maxnick 191 days ago
Enable xattention for intel CPU (c9b96b5e)
Integrate mask to PA (b4bbb28f)
Support multiple sequences for xattention (4c435044)
Fix find block issue (276fd0b1)
Copy key buffer before executing GEMM and bf16 precision support. (a09b3a49)
revert pa conflict changes with sparse attention execution (2408add3)
enable sparse attention execution (a6ff6674)
Use optimized softmax kernel in X-Attention (de6a6def)
Optimize block sum using hardware instructions (715f790c)
Add macro definitions to not support ARM platform (a9c542c3)
modify xattention_block_size and PA block_size check and scale calcul… (114ef7ff)
optimize sparse softmax mask in sparse mask attention: extend attn_so… (c86732d9)
Add const to unchanged variable (bcede76c)
Minor changes in find blocks logic and remove useless parameter. (aac68ab6)
Fix clang-format issue (ca4347d2)
optimize use_softmax_sparse_mask check (c0ffb6af)
Apply suggestions from code review (fcf896c4)
Add test case. (8d390934)
Fix Clang-format issue (bcb1309c)
Avoid repeated brgemm kernel creation and temp buffer creation (a654802d)
Fix ARM cross compile (1419bd60)
Fix CI failure (feb4d228)
Fix arm compile error (2ac85cbd)
Fix RISCV64 error (a92433a7)
Change sum_block logic to handle column num. (9f27b2f1)
Add const to unchanged variables. Set brgemm output buffer. (458a1240)
fix arm testcase fail issue: Fix ARM testcase failure: Retain the C++… (dfdcdbbf)
Add test case compare with reference implementation (2939bc6d)
mangguo321 force pushed from 40e7ec43 to 2939bc6d 190 days ago
maxnick commented on 2025-11-21
Extend pagedAttention functional test to support xattention. Remove c… (e0832a6a)
Minor change in test case and add condition check (4cb9e882)
Add transpose.hpp (e30f1cc4)
Apply review comments (defaa5f6)
Fix CI failure (8c005254)
maxnick approved these changes on 2025-11-27
maxnick merged 3ac45ad5 into master 183 days ago
Reviewers: maxnick, zhangYiIntel, liubo-intel, copilot-pull-request-reviewer, vshampor
Assignees: maxnick, zhangYiIntel
Labels: category: CPU, category: build
Milestone: 2026.0