onnxruntime
Add quantized KV cache support in CPU GroupQueryAttention
#28576
Merged

tianleiwu
tianleiwu Add quantized kv cache support in GQA cpu
8c99df18
tianleiwu fix build
475506c4
tianleiwu requested a review from copilot-pull-request-reviewer 3 days ago
tianleiwu requested a review from apsonawane 3 days ago
copilot-pull-request-reviewer commented on 2026-05-19
tianleiwu requested a review from kunal-vaishnavi 3 days ago
tianleiwu Address Copilot review feedback
9c225ef5
tianleiwu force-pushed from e9d0bbcd to 9c225ef5 3 days ago
tianleiwu remove unused code
2ad76d7a
kunal-vaishnavi commented on 2026-05-20
kunal-vaishnavi dismissed these changes on 2026-05-20
apsonawane dismissed these changes on 2026-05-20
tianleiwu Fix CI: INT4 test verifier unpack bug, update OperatorKernels.md
10e054a7
tianleiwu dismissed their stale review via 10e054a7 3 days ago
tianleiwu dismissed their stale review via 10e054a7 3 days ago
tianleiwu requested a review from kunal-vaishnavi 3 days ago
tianleiwu requested a review from apsonawane 3 days ago
kunal-vaishnavi approved these changes on 2026-05-20
kunal-vaishnavi enabled auto-merge (squash) 3 days ago
kunal-vaishnavi merged 6fbda6d8 into main 3 days ago
kunal-vaishnavi deleted the tlwu/20260519/gqa_cpu_quantized_kv_cache branch 3 days ago