vllm
98e7f223 - enable skipping of SW attention layers when using FP8 KV cache (#33695)

Commit
38 days ago
enable skipping of SW attention layers when using FP8 KV cache (#33695)

Signed-off-by: Jonas Kuebler <kuebj@amazon.com>