vllm
Feat Dynamic Quantization for MoE Layers in GPTQ Marlin Backend
#19395
Merged
mgoin merged 5 commits into vllm-project:main from Jun-Howie:main
Jun-Howie requested a review from mgoin 349 days ago
Jun-Howie requested a review from robertgshaw2-redhat 349 days ago
Jun-Howie requested a review from tlrmchlsmth 349 days ago
mgoin commented on 2025-06-11
Jun-Howie requested a review from hmellor 346 days ago
Jun-Howie requested a review from jeejeelee 346 days ago
Jun-Howie requested a review from njhill 346 days ago
Jun-Howie requested a review from LiuXiaoxuanPKU 346 days ago
Jun-Howie requested a review from DarkLight1337 346 days ago
Jun-Howie requested a review from ywang96 346 days ago
Jun-Howie requested a review from WoosukKwon 346 days ago
Jun-Howie requested a review from simon-mo 346 days ago
Jun-Howie requested a review from aarnphm 346 days ago
Jun-Howie requested a review from comaniac 346 days ago
Jun-Howie requested a review from alexm-redhat 346 days ago
Jun-Howie requested a review from zhuohan123 346 days ago
Jun-Howie requested a review from youkaichao 346 days ago
mergify added documentation
mergify added ci/build
mergify added frontend
mergify added llama
mergify added multi-modality
mergify added rocm
mergify added structured-output
mergify added speculative-decoding
mergify added v1
mergify added tpu
mergify added tool-calling
Jun-Howie force pushed to 54dc8b49 346 days ago
mergify removed tpu
Feat Dynamic Quantization for MoE Layers in GPTQ Marlin Backend (8afb6833)
Update gptq_marlin.py (c0bf14e0)
Update gptq_marlin.py (dcd1a338)
Update gptq_marlin.py (b0cfad25)
Update gptq_marlin.py (fb5606c8)
Jun-Howie force pushed from 54dc8b49 to fb5606c8 346 days ago
Jun-Howie requested a review from mgoin 346 days ago
mgoin approved these changes on 2025-06-13
mgoin added ready
mgoin removed documentation
mgoin removed rocm
mgoin removed structured-output
mgoin removed frontend
mgoin removed speculative-decoding
mgoin removed ci/build
mgoin removed v1
mgoin removed multi-modality
mgoin removed tool-calling
mgoin removed llama
mgoin added quantization
mgoin commented on 2025-06-23
mgoin merged dd2ccf8d into main 335 days ago
Reviewers: mgoin, robertgshaw2-redhat, tlrmchlsmth, hmellor, jeejeelee, njhill, LiuXiaoxuanPKU, DarkLight1337, ywang96, WoosukKwon, simon-mo, aarnphm, comaniac, alexm-redhat, zhuohan123, youkaichao
Assignees: No one assigned
Labels: ready
Milestone: No milestone