vllm
Feat Dynamic Quantization for MoE Layers in GPTQ Marlin Backend
#19395
Merged

mgoin merged 5 commits into vllm-project:main from Jun-Howie:main
Jun-Howie
Jun-Howie requested a review from mgoin 349 days ago
Jun-Howie requested a review from robertgshaw2-redhat 349 days ago
Jun-Howie requested a review from tlrmchlsmth 349 days ago
gemini-code-assist
github-actions
mgoin commented on 2025-06-11
Jun-Howie
Jun-Howie requested a review from hmellor 346 days ago
Jun-Howie requested a review from jeejeelee 346 days ago
Jun-Howie requested a review from njhill 346 days ago
Jun-Howie requested a review from LiuXiaoxuanPKU 346 days ago
Jun-Howie requested a review from DarkLight1337 346 days ago
Jun-Howie requested a review from ywang96 346 days ago
Jun-Howie requested a review from WoosukKwon 346 days ago
Jun-Howie requested a review from simon-mo 346 days ago
Jun-Howie requested a review from aarnphm 346 days ago
Jun-Howie requested a review from comaniac 346 days ago
Jun-Howie requested a review from alexm-redhat 346 days ago
Jun-Howie requested a review from zhuohan123 346 days ago
Jun-Howie requested a review from youkaichao 346 days ago
mergify added documentation
mergify added ci/build
mergify added frontend
mergify added llama
mergify added multi-modality
mergify added rocm
mergify added structured-output
mergify added speculative-decoding
mergify added v1
mergify added tpu
mergify added tool-calling
Jun-Howie force pushed to 54dc8b49 346 days ago
mergify removed tpu
Feat Dynamic Quantization for MoE Layers in GPTQ Marlin Backend
8afb6833
Jun-Howie Update gptq_marlin.py
c0bf14e0
Jun-Howie Update gptq_marlin.py
dcd1a338
Jun-Howie Update gptq_marlin.py
b0cfad25
Jun-Howie Update gptq_marlin.py
fb5606c8
Jun-Howie force pushed from 54dc8b49 to fb5606c8 346 days ago
Jun-Howie requested a review from mgoin 346 days ago
Jun-Howie
mgoin approved these changes on 2025-06-13
mgoin added ready
mgoin removed documentation
mgoin removed rocm
mgoin removed structured-output
mgoin removed frontend
mgoin removed speculative-decoding
mgoin removed ci/build
mgoin removed v1
mgoin removed multi-modality
mgoin removed tool-calling
mgoin removed llama
mgoin added quantization
Jun-Howie
Jun-Howie
mgoin commented on 2025-06-23
mgoin merged dd2ccf8d into main 335 days ago
