vllm
98e7f223 - enable skipping of SW attention layers when using FP8 KV cache (#33695)
Commit
38 days ago
enable skipping of SW attention layers when using FP8 KV cache (#33695)

Signed-off-by: Jonas Kuebler <kuebj@amazon.com>
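A minimal sketch of the idea behind this commit: with an FP8-quantized KV cache, sliding-window (SW) attention layers may now be skipped rather than force-included. All names below (`Layer`, `layers_to_process`, `kv_cache_dtype`, `skip_sw_layers`) are illustrative assumptions, not vLLM's actual API.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Layer:
    index: int
    sliding_window: bool  # True for sliding-window attention layers

def layers_to_process(layers: List[Layer], kv_cache_dtype: str,
                      skip_sw_layers: bool) -> List[int]:
    """Return indices of layers participating in the KV-cache pass.

    Hypothetical logic: previously an FP8 KV cache implied every layer
    had to participate; this change lets SW layers be skipped.
    """
    if kv_cache_dtype == "fp8" and skip_sw_layers:
        # Keep only full-attention layers; SW layers are skipped.
        return [l.index for l in layers if not l.sliding_window]
    # Otherwise every layer participates.
    return [l.index for l in layers]

layers = [Layer(0, False), Layer(1, True), Layer(2, False), Layer(3, True)]
print(layers_to_process(layers, "fp8", skip_sw_layers=True))   # full-attention layers only
print(layers_to_process(layers, "fp8", skip_sw_layers=False))  # all layers
```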
References
#33695 - enable skipping of SW attention layers when using FP8 KV cache
Author
jmkuebler
Parents
b111f8a6