Improve the handling of quantized weights #2250
Commit: Improve the handling of quantized weights (a93b2b50)
- danieldk force-pushed from 5bbbce9c to e22f411c (1 year ago)
- danieldk dismissed their stale review via 8ebec90a (1 year ago)
- danieldk force-pushed from e22f411c to 8ebec90a (1 year ago)
- danieldk force-pushed from 8ebec90a to 59fc128c (1 year ago)
- danieldk force-pushed from 59fc128c to d819a3c2 (1 year ago)
Commit: Exclude non-MLP layers when using FP8 quantization with Llama (cf16172a)
- danieldk force-pushed from d819a3c2 to cf16172a (1 year ago)
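The commit title above suggests restricting FP8 quantization to Llama's MLP projection layers, leaving attention, norm, and embedding weights unquantized. A minimal sketch of how such a name-based exclusion could look — the function and module names here are hypothetical illustrations, not the actual API of this repository:

```python
# Hypothetical sketch: select only Llama MLP projections for FP8 quantization.
# The suffix list and layer names below are illustrative assumptions.
MLP_SUFFIXES = ("gate_proj", "up_proj", "down_proj")

def should_quantize_fp8(module_name: str) -> bool:
    """Return True only for MLP projection weights; skip attention and norms."""
    return module_name.endswith(MLP_SUFFIXES)

layers = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.mlp.gate_proj",
    "model.layers.0.mlp.down_proj",
    "model.layers.0.input_layernorm",
]
fp8_layers = [name for name in layers if should_quantize_fp8(name)]
```

With the sample names above, only the two `mlp.*_proj` entries are selected, matching the "exclude non-MLP layers" intent of the commit.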
- danieldk merged ba291dad into main (1 year ago)
- danieldk deleted the refactor/quantization-handling branch (1 year ago)