Add support for FP8 KV cache scales #2628
danieldk force pushed from 52bbb233 to fb9bd07c 1 year ago
danieldk force pushed from fb9bd07c to 08c0b3f2 1 year ago
danieldk force pushed from 08c0b3f2 to 98efcb49 1 year ago
danieldk marked this pull request as ready for review 1 year ago
Add support for FP8 KV cache scales
ba4ac963
Update FP8 KV cache test to use checkpoint with scales
1f18cb6a
danieldk force pushed from 98efcb49 to 1f18cb6a 1 year ago
`can_scale`: check that the attention is flashinfer
a68fae05
danieldk merged eab07f74 into main 1 year ago
danieldk deleted the feature/fp8-kv-cache-scale branch 1 year ago