text-generation-inference
fix: allocate tmp based on sgmv kernel if available
#2345
Merged

Narsil merged 2 commits into main from adjust-lora-tmp-tensor-allocation
drbh
drbh fix: allocate tmp based on sgmv kernel if available
e4a0bf3b
drbh fix: re add copy build artifacts step for punica kernels
7101bf29
Narsil
Narsil approved these changes on 2024-08-12
Narsil merged 4c3f8a70 into main 1 year ago
Narsil deleted the adjust-lora-tmp-tensor-allocation branch 1 year ago

