Add G2-specific `FP8_STATIC` support #1148
46749f0c  add attention quant
f743ffba  add ut
a81b5145  add llama patch
157f6d13  correct fp8
586462f8  add utils
591549b2  merge main
65a467ee  fix shape
e9157967  enable compile for hpu
be5d94ed  compile rtn
9d09edec  add cmd
2d2c3122  update cmd
cdcc5c4f  fix atten
a7b6c33a  fix
0a2de215  fix
ddc59772  fix q scale shape
1c989d90  clean max
76389619  merge main
cc24671b  clean
ede9b27e  clean
45592e19  fix
9610c07d  clean
b311d72d  fix
8c5e0271  fix
e9ad45be  revert
d0c3f949  Merge branch 'main' into quant-attn-hpu-up
5a3954f3  clean code
3a75ae52  fix
4bfd73a1  fix
f81cad7b  clean
47bb10de  remove test cmd
3278bb46  [pre-commit.ci] auto fixes from pre-commit.com hooks
d191bbac  Update base.py
b93ae29d  Merge branch 'main' into quant-attn-hpu-up
4ac50213  Merge branch 'main' into quant-attn-hpu-up
f39b251b  clean
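For context on what the PR title refers to: static FP8 quantization derives a fixed per-tensor scale from calibration statistics ahead of time, rather than recomputing it per batch. Below is a minimal, hypothetical sketch of such a fake-quant round trip using the FP8 E4M3 dynamic range; the helper names are illustrative and are not this repository's API (and real kernels would also round values to the FP8 grid, which this sketch omits).

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3


def compute_static_scale(calib_max: float) -> float:
    """Derive a fixed per-tensor scale from a calibration-time max (hypothetical helper)."""
    return calib_max / FP8_E4M3_MAX


def quant_dequant_fp8(x: np.ndarray, scale: float) -> np.ndarray:
    """Fake-quant round trip: scale into FP8 range, clip, then dequantize.

    Rounding to the actual E4M3 value grid is omitted for brevity.
    """
    q = np.clip(x / scale, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return q * scale
```

Usage: with a calibration max of 448.0 the scale is 1.0, so in-range values survive the round trip unchanged while out-of-range values are clipped to ±448.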
n1ck-guo approved these changes on 2025-12-19
yiliu30 merged 547d43f8 into main 100 days ago
yiliu30 deleted the quant-attn-hpu-up branch 100 days ago
Assignees: no one assigned