llama.cpp
cuda : fix vmm pool with multi GPU
#4620
Merged

slaren merged 13 commits into master from sl/cuda-virt-pool-fixes
slaren
slaren cuda : fix vmm pool with multi GPU
32dc09aa
slaren hip
2c3fbf98
slaren use recommended granularity instead of minimum
a76cadad
slaren better error checking
6f35a4a6
city96
slaren
city96
slaren
city96
slaren fix mixtral
1659cd1b
slaren
JohannesGaessler
JohannesGaessler
slaren
slaren
JohannesGaessler
slaren
slaren force pushed from 061d9652 2 years ago
slaren force pushed 2 years ago
slaren force pushed 2 years ago
slaren use cudaMemcpy3DPeerAsync
865d042d
slaren force pushed to 865d042d 2 years ago
slaren use cuda_pool_alloc in ggml_cuda_op_mul_mat
32304d79
slaren consolidate error checking in ggml_cuda_set_device
692887fb
slaren remove unnecessary inlines
561f1f95
city96
slaren style fixes
0dcc1a77
city96
slaren only use vmm for the main device
23c6dd67
slaren
city96
slaren fix scratch buffer size, re-enable vmm pool for all devices
da9fc775
slaren
city96
ebudmada
slaren
slaren requested a review from ggerganov 2 years ago
slaren requested a review from JohannesGaessler 2 years ago
JohannesGaessler
JohannesGaessler commented on 2023-12-26
ggerganov
ggerganov approved these changes on 2023-12-26
slaren remove unnecessary check id != g_main_device
f097bed5
slaren merged dc68f005 into master 2 years ago
slaren deleted the sl/cuda-virt-pool-fixes branch 2 years ago
phalexo
