auto-round
Add static FP8 attention support
#1045
Merged

yiliu30 merged 27 commits into main from quant-attn
yiliu30 add attention quant
46749f0c
yiliu30 add ut
f743ffba
yiliu30 add llama patch
a81b5145
yiliu30 correct fp8
157f6d13
yiliu30 add utils
586462f8
yiliu30 merge main
591549b2
yiliu30 fix shape
65a467ee
yiliu30 tmp
da1fe7fc
yiliu30 clean code
4f3b0a32
yiliu30 marked this pull request as draft 85 days ago
yiliu30 Merge branch 'main' into quant-attn
ae3a4aa7
yiliu30 add ut
ceca38a6
yiliu30 marked this pull request as ready for review 85 days ago
yiliu30 clean
a49c09b7
yiliu30 Merge branch 'quant-attn' of https://github.com/intel/auto-round into…
90bf465f
yiliu30 fix
adc5cb3b
yiliu30 refine
a61bd657
yiliu30 requested a review from n1ck-guo 85 days ago
yiliu30 requested a review from wenhuach21 85 days ago
yiliu30 clean
c4bfce03
yiliu30 fix
478eef09
yiliu30 fix
5ed5f724
yiliu30 fix
53f6ae8a
yiliu30 fix
741f818f
yiliu30 fix alias tensor
ae6cec51
yiliu30 fix ut
ffa5ac56
yiliu30 requested a review from copilot-pull-request-reviewer 84 days ago
copilot-pull-request-reviewer commented on 2025-11-20
yiliu30 Merge branch 'main' into quant-attn
c7d72d57
yiliu30 added this to the 0.9.1 milestone 83 days ago
yiliu30 Merge branch 'main' into quant-attn
641089df
n1ck-guo commented on 2025-11-24
n1ck-guo commented on 2025-11-24
n1ck-guo approved these changes on 2025-11-24
yiliu30 Merge branch 'main' into quant-attn
61ca489d
yiliu30 update
b698ec43
yiliu30 enabled auto-merge (squash) 80 days ago
yiliu30 fix
3b363531
yiliu30 merged c5b1c412 into main 80 days ago
yiliu30 deleted the quant-attn branch 80 days ago
yiliu30 restored the head branch 80 days ago

Assignees
No one assigned