vllm
Move query quantization to attention layer for Flashinfer & Triton.
#26534
Merged
