Add static FP8 attention support #1045
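This PR adds static FP8 attention quantization to auto-round. As background for the commit history below, here is a minimal sketch of what static FP8 (e4m3) fake quantization of attention inputs can look like in PyTorch, assuming per-tensor scales obtained from a prior calibration pass. The class name, helper, and scale handling are illustrative assumptions for this sketch, not the PR's actual implementation.

```python
import torch
import torch.nn.functional as F

FP8_E4M3_MAX = 448.0  # largest finite magnitude of torch.float8_e4m3fn


def quant_dequant_fp8(x: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Fake-quantize: cast to FP8 e4m3 with a static scale, then cast back."""
    q = (x / scale).clamp(-FP8_E4M3_MAX, FP8_E4M3_MAX).to(torch.float8_e4m3fn)
    return q.to(x.dtype) * scale


class StaticFP8Attention(torch.nn.Module):
    """Scaled-dot-product attention with static FP8 fake quantization of
    Q/K/V. The scales are fixed ahead of inference, which is what makes
    the scheme "static"."""

    def __init__(self, q_scale: float, k_scale: float, v_scale: float):
        super().__init__()
        self.register_buffer("q_scale", torch.tensor(q_scale))
        self.register_buffer("k_scale", torch.tensor(k_scale))
        self.register_buffer("v_scale", torch.tensor(v_scale))

    def forward(self, q, k, v):
        q = quant_dequant_fp8(q, self.q_scale)
        k = quant_dequant_fp8(k, self.k_scale)
        v = quant_dequant_fp8(v, self.v_scale)
        return F.scaled_dot_product_attention(q, k, v)
```

In a static scheme the scales would come from calibration statistics, e.g. `scale = calib_amax / FP8_E4M3_MAX` per tensor, rather than being recomputed from activations at inference time; this sketch simply takes them as constructor arguments.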
add attention quant (46749f0c)
add ut (f743ffba)
add llama patch (a81b5145) (sketched below)
correct fp8 (157f6d13)
add utils (586462f8)
merge main (591549b2)
fix shape (65a467ee)
tmp (da1fe7fc)
clean code (4f3b0a32)
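One commit above adds a "llama patch". As a purely hypothetical illustration of that idea (the PR's real patching mechanism is not shown here, and all names and scale values below are assumptions), attention calls in a LLaMA-style model could be routed through the FP8 fake-quant step by wrapping torch's SDPA entry point:

```python
import torch
import torch.nn.functional as F

FP8_E4M3_MAX = 448.0
_orig_sdpa = F.scaled_dot_product_attention


def _fake_quant_fp8(x: torch.Tensor, scale: float) -> torch.Tensor:
    q = (x / scale).clamp(-FP8_E4M3_MAX, FP8_E4M3_MAX).to(torch.float8_e4m3fn)
    return q.to(x.dtype) * scale


def patched_sdpa(q, k, v, *args, **kwargs):
    # Static per-tensor scales; 1.0 is a placeholder, real values would
    # come from a calibration pass.
    q, k, v = (_fake_quant_fp8(t, 1.0) for t in (q, k, v))
    return _orig_sdpa(q, k, v, *args, **kwargs)


# Monkey-patch so existing model code (e.g. transformers' LlamaAttention
# when it uses SDPA) picks up the quantized path without modification.
F.scaled_dot_product_attention = patched_sdpa
```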
yiliu30 marked this pull request as draft 85 days ago
Merge branch 'main' into quant-attn (ae3a4aa7)
add ut (ceca38a6)
yiliu30 marked this pull request as ready for review 85 days ago
clean (a49c09b7)
Merge branch 'quant-attn' of https://github.com/intel/auto-round into… (90bf465f)
fix (adc5cb3b)
refine (a61bd657)
clean (c4bfce03)
fix (478eef09)
fix (5ed5f724)
fix (53f6ae8a)
fix (741f818f)
fix alias tensor (ae6cec51)
fix ut (ffa5ac56)
Merge branch 'main' into quant-attn (c7d72d57)
yiliu30 added this to the 0.9.1 milestone 83 days ago
Merge branch 'main' into quant-attn (641089df)
n1ck-guo approved these changes on 2025-11-24
Merge branch 'main' into quant-attn (61ca489d)
update (b698ec43)
yiliu30 enabled auto-merge (squash) 80 days ago
fix (3b363531)
yiliu30 merged commit c5b1c412 into main 80 days ago
yiliu30 deleted the quant-attn branch 80 days ago
yiliu30 restored the head branch 80 days ago
Assignees: no one assigned