text-generation-webui
Add Quanto4,2, HQQ4,2 KV cache quantization support to Transformers loader
#6768
Open

dinerburger wants to merge 3 commits into oobabooga:dev from dinerburger:quanto
dinerburger Get quanto4,2 KV cache working in Transformers
4a55c742
dinerburger Add optimum-quanto to requirements
d0ea0509
dinerburger Add HQQ KV cache quantization for Transformers
244d3cb6
dinerburger changed the title from "Add Quanto4,2 KV cache quantization support to Transformers loader" to "Add Quanto4,2, HQQ4,2 KV cache quantization support to Transformers loader" 245 days ago
cceneag requested changes on 2025-10-08

Assignees: No one assigned