PR #2250 Improve the handling of quantized weights

danieldk merged 2 commits into main from refactor/quantization-handling

Improve the handling of quantized weights

a93b2b50

danieldk force pushed from 5bbbce9c to e22f411c 1 year ago

OlivierDehaene dismissed these changes on 2024-07-18

OlivierDehaene requested a review from OlivierDehaene 1 year ago

danieldk dismissed their stale review via 8ebec90a 1 year ago

danieldk force pushed from e22f411c to 8ebec90a 1 year ago

OlivierDehaene commented on 2024-07-18

danieldk force pushed from 8ebec90a to 59fc128c 1 year ago

danieldk force pushed from 59fc128c to d819a3c2 1 year ago

Exclude non-MLP layers when using FP8 quantization with Llama

cf16172a

danieldk force pushed from d819a3c2 to cf16172a 1 year ago

OlivierDehaene approved these changes on 2024-07-18

danieldk merged ba291dad into main 1 year ago

danieldk deleted the refactor/quantization-handling branch 1 year ago

Reviewers

OlivierDehaene
