[CUDA] Support FP8 (E4M3) KV Cache for Group Query Attention #27321
fix build errors
9aa8deb7
Support fp8 kv cache
019a2b15
update comments
abcbdadc
tianleiwu
marked this pull request as draft 122 days ago
update doc
3a260ef8
update io binding helper type mapping
9ad73e43
copilot feedback
e83141ab
update test
d28d92da
tianleiwu
marked this pull request as ready for review 122 days ago
lintrunner
2a52fc15
consolidate cuda type
068d0529
Merge remote-tracking branch 'origin/main' into tlwu/20260211/gqa_fp8…
b073388c
refine
3d507bd0
fix build
2a257802
tianleiwu
enabled auto-merge (squash) 116 days ago
tianleiwu
merged
19c9efc4
into main 116 days ago
tianleiwu
deleted the tlwu/20260211/gqa_fp8_kv_cache branch 116 days ago