Improve cuBLAS performance by dequantizing on the GPU #1065
Improve cuBLAS performance with quantized models by dequantizing on t…
359b0560
Remove unused parameters
891af05e
ggerganov approved these changes on 2023-04-19
Fix possible synchronization issue
95cf9597
Fix windows build
18337719
slaren merged 02d69881 into master 2 years ago
slaren deleted the cuda-dq branch 2 years ago