vllm
260d119e - [Kernel] Refactor CUTLASS kernels to always take scales that reside on the GPU (#5137)

Commit

1 year ago

Author

tlrmchlsmth

Parents