llama.cpp
vulkan : add backend registry / device interfaces
#9721
Merged

slaren merged 3 commits into master from sl/vulkan-reg-2
slaren (227 days ago)
No description provided.
github-actions added the Vulkan label
github-actions added the ggml label
slaren force-pushed from 11cb93a5 to e25c9c12 226 days ago
slaren marked this pull request as ready for review 226 days ago
slaren (226 days ago)

@0cc4m This PR has two additional changes:

  • Translates the device index in ggml_backend_vk_get_device_description (I believe this was a bug)
  • Changes the names of the backends, buffers, etc. to Vulkan<idx>. This is the intended use for the name of these objects; a more detailed description can now be obtained through the ggml-backend device interface.

After this change it is possible to use Vulkan and CUDA in the same llama.cpp build (you may have to disable the NVIDIA devices in the Vulkan backend using the GGML_VK_VISIBLE_DEVICES environment variable).
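
The index translation and the Vulkan<idx> naming mentioned above can be illustrated with a minimal, self-contained sketch. This is not the Vulkan backend's code: parse_visible_devices, to_physical_index, describe_device, and backend_name are hypothetical helpers, and the comma-separated list format assumed for the GGML_VK_VISIBLE_DEVICES value is an assumption made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parse a list such as "0,2" into physical device indices.
// Assumes a comma-separated list of non-negative integers.
std::vector<std::size_t> parse_visible_devices(const std::string & spec) {
    std::vector<std::size_t> out;
    std::stringstream ss(spec);
    std::string tok;
    while (std::getline(ss, tok, ',')) {
        if (tok.empty()) {
            continue; // skip empty entries such as a trailing comma
        }
        out.push_back(std::stoul(tok)); // throws on non-numeric input
    }
    return out;
}

// Translate a logical (backend-visible) index into a physical device index.
// Returns std::nullopt when the logical index is out of range.
std::optional<std::size_t> to_physical_index(const std::vector<std::size_t> & visible,
                                             std::size_t logical) {
    if (logical >= visible.size()) {
        return std::nullopt;
    }
    return visible[logical];
}

// Look up a description by logical index, translating it first.
// Indexing the physical list directly with the logical index is the kind of
// mismatch the PR comment describes as a bug.
std::string describe_device(const std::vector<std::string> & physical_descriptions,
                            const std::vector<std::size_t> & visible,
                            std::size_t logical) {
    const auto phys = to_physical_index(visible, logical);
    if (!phys || *phys >= physical_descriptions.size()) {
        return "unknown";
    }
    return physical_descriptions[*phys];
}

// Short name in the "Vulkan<idx>" style, built from the logical index.
std::string backend_name(std::size_t logical) {
    return "Vulkan" + std::to_string(logical);
}

// Example: three physical devices, of which only 0 and 2 are made visible.
void example_usage() {
    const std::vector<std::string> physical = {"GPU A", "GPU B", "GPU C"};
    const std::vector<std::size_t> visible = parse_visible_devices("0,2");

    // Logical device 1 is physical device 2, so its description is "GPU C".
    assert(describe_device(physical, visible, 1) == "GPU C");
    assert(backend_name(1) == "Vulkan1");
}
```

In this sketch the short name stays stable ("Vulkan0", "Vulkan1", ...) regardless of which physical devices are selected, while the longer description is resolved through the translated index.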

0cc4m requested a review from 0cc4m 225 days ago
slaren vulkan : add backend registry / device interfaces
5f4e30dd
slaren force-pushed from e25c9c12 to 9e04f2cb 222 days ago
slaren llama : print devices used on model load
20ca856a
slaren force-pushed from 9e04f2cb to 20ca856a 222 days ago
MaggotHATE (221 days ago)

Seems to work fine (Win10), but I'm noticing another increase in layer size. Previously, with Mistral-Nemo-Instruct-2407.q5_k_l, I could offload 5 layers on 3GB of VRAM; now it's only 3. Is this expected? The total VRAM usage is pretty much the same as before the backend registry updates.

slaren (221 days ago, 👍 1)

I don't think there are any changes here that could increase the memory usage. It's just exposing existing functionality of the Vulkan backend through a different interface.

0cc4m (213 days ago)

@slaren Thank you for implementing this. I can confirm it builds on Linux and that the code looks good. I can't fully test it currently since my server is still disassembled because I'm in the process of moving between cities. I should be able to reassemble it this weekend, but I'm still very busy. You can decide whether you prefer to wait or think it's ready to merge.

slaren (213 days ago)

Can you check the changes to ggml_backend_vk_get_device_description? Previously, it wouldn't translate the device index to the indices given by GGML_VK_VISIBLE_DEVICES, which I believe was a bug. Other than that, I think there is very little chance that this PR breaks anything.

0cc4m (213 days ago, 👍 1)

That was a bug, yeah.

ggerganov approved these changes on 2024-10-16
slaren Merge remote-tracking branch 'origin/master' into sl/vulkan-reg-2
2363a480
slaren merged f010b77a into master 213 days ago
slaren deleted the sl/vulkan-reg-2 branch 213 days ago
