@0cc4m This PR has two additional changes:

- `ggml_backend_vk_get_device_description` now translates the device index to the indexes given by `GGML_VK_VISIBLE_DEVICES` (I believe this was a bug).
- The name of the Vulkan devices is now `Vulkan<idx>`. This is the intended use for the name of these objects; a more detailed description can now be obtained using the ggml-backend device interface.

After this change it is possible to use Vulkan and CUDA in the same llama.cpp build (you may have to disable the NVIDIA devices in the Vulkan backend using the `GGML_VK_VISIBLE_DEVICES` environment variable).
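As a quick way to see the result, here is a minimal sketch that enumerates devices through the ggml-backend registry and prints both the short name (e.g. `Vulkan0`) and the detailed description; it assumes the `ggml_backend_dev_*` functions declared in `ggml-backend.h`:

```cpp
// List every registered device with its short name and detailed description.
#include "ggml-backend.h"
#include <cstdio>

int main(void) {
    for (size_t i = 0; i < ggml_backend_dev_count(); i++) {
        ggml_backend_dev_t dev = ggml_backend_dev_get(i);
        printf("device %zu: name=%s, description=%s\n",
               i, ggml_backend_dev_name(dev), ggml_backend_dev_description(dev));
    }
    return 0;
}
```

If I read the env var handling correctly, `GGML_VK_VISIBLE_DEVICES` takes a comma-separated list of physical device indexes (e.g. `GGML_VK_VISIBLE_DEVICES=0,2`), so it can be used to hide the NVIDIA devices from the Vulkan backend when CUDA is also enabled.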
Seems to work fine (Win10), but I'm noticing another increase in layer size. Previously with `Mistral-Nemo-Instruct-2407.q5_k_l` I could offload 5 layers on 3 GB VRAM, now it's only 3. Is this expected? The total VRAM usage is pretty much the same as before the backend registry updates.
I don't think there are any changes here that could increase the memory usage. It's just exposing existing functionality of the vulkan backend through a different interface.
@slaren Thank you for implementing this. I can confirm it builds on Linux and that the code looks good. I can't fully test it currently since my server is still disassembled because I'm in the process of moving between cities. I should be able to reassemble it this weekend, but I'm still very busy. You can decide whether you prefer to wait or whether you think it's ready to merge.
Can you check the changes to `ggml_backend_vk_get_device_description`? Previously, it wouldn't translate the device index to the indexes given by `GGML_VK_VISIBLE_DEVICES`, which I believe was a bug. Other than that, I think that there is very little chance that this PR breaks anything.
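For context, the fix is roughly shaped like this; a sketch, not the exact diff, assuming the backend keeps the filtered list from `GGML_VK_VISIBLE_DEVICES` in something like `vk_instance.device_indices` (check `ggml-vulkan.cpp` for the real names):

```cpp
// Sketch: the public device index selects an entry in the filtered device
// list, and that entry is the actual physical device index. Before the fix,
// the public index was passed through directly, which goes wrong whenever
// GGML_VK_VISIBLE_DEVICES filters or reorders the devices.
void ggml_backend_vk_get_device_description(int device, char * description, size_t description_size) {
    GGML_ASSERT(device < (int) vk_instance.device_indices.size());
    int dev_idx = vk_instance.device_indices[device];
    ggml_vk_get_device_description(dev_idx, description, description_size);
}
```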
That was a bug, yeah.