vllm
[Hardware][NV] Fix Modelopt model loading for k-v-scales for Llama models.
#11787
Merged
simon-mo merged 6 commits into vllm-project:main from pavanimajety:modelopt-k-v-scales
mgoin requested a review from mgoin 337 days ago
pavanimajety force pushed 337 days ago
pavanimajety marked this pull request as ready for review 337 days ago
mgoin commented on 2025-01-08
pavanimajety changed the title from "[Hardware][NV] Fix Modelopt model loading for k-v-scales" to "[Hardware][NV] Fix Modelopt model loading for k-v-scales for Llama models." 327 days ago
pavanimajety force pushed 327 days ago
16e4650e [Hardware][NV] Fix Modelopt model loading for k-v-scales
c02df146 Format
aee253a5 NFC: remove print
229ebe8e Address Feedback
705cf4e6 Add scales to mixtral models as well
pavanimajety force pushed to 705cf4e6 322 days ago
pavanimajety requested a review from mgoin 322 days ago
045e2000 Merge branch 'main' into modelopt-k-v-scales
mgoin approved these changes on 2025-01-27
mgoin added the quantization label
mgoin added the ready label
mgoin enabled auto-merge (squash) 317 days ago
disabled auto-merge 315 days ago (manually disabled by user)
simon-mo merged b02fd288 into main 315 days ago
Reviewers: mgoin
Assignees: No one assigned
Labels: quantization, ready
Milestone: No milestone