backend : offload large batches to GPU #6083
backend : offload large batches to GPU
5b6b4ac2
fix hip
c2dba045
code cleanup
3a774427
fix CUDA split buffers
c0fe6298
slaren
force pushed
from
dc93f5ac
to
c0fe6298
1 year ago
Update ggml-backend-impl.h
8e717e8c
cuda : fix memset without set_device
9cba8a18
imatrix : remove sched affix from weight names
98090759
sched : add a new split if the current one has too many inputs
0661e6a1
slaren
marked this pull request as ready for review 1 year ago
slaren
force pushed
1 year ago
update backends
cc9299ce
slaren
force pushed
to
cc9299ce
1 year ago
ggerganov
approved these changes
on 2024-03-18
slaren
merged
2bf8d0f7
into master 1 year ago
slaren
deleted the sl/sched-auto-offload branch 1 year ago