llama.cpp
vulkan : add backend registry / device interfaces
#9721
Merged

slaren merged 3 commits into master from sl/vulkan-reg-2
slaren commented 299 days ago
No description provided.
github-actions added the Vulkan label
github-actions added the ggml label
slaren force pushed from 11cb93a5 to e25c9c12 298 days ago
slaren marked this pull request as ready for review 298 days ago
slaren commented 298 days ago

@0cc4m This PR has two additional changes:

  • Translates the device index in ggml_backend_vk_get_device_description (I believe this was a bug)
  • Changes the names of the backends, buffers, etc. to Vulkan<idx>. This is the intended use for the name of these objects; a more detailed description can now be obtained through the ggml-backend device interface.
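
The split between a short name and a detailed description can be sketched as follows. This is only an illustration: `device_info`, `vk_short_name`, and `make_device_info` are hypothetical names, not the actual ggml types or functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical record for illustration only; not the actual ggml types.
struct device_info {
    std::string name;        // short, stable identifier, e.g. "Vulkan0"
    std::string description; // human-readable detail, e.g. the GPU model string
};

// Build the short name from the backend-local device index.
std::string vk_short_name(std::size_t idx) {
    return "Vulkan" + std::to_string(idx);
}

// Pair the short name with a longer description supplied by the caller.
device_info make_device_info(std::size_t idx, const std::string & gpu_model) {
    assert(!gpu_model.empty()); // a description is expected to be non-empty
    return { vk_short_name(idx), gpu_model };
}
```

Under these assumptions, `make_device_info(1, "Example GPU")` yields the name `Vulkan1`, while the model string is kept separately as the description.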

After this change, it is possible to use Vulkan and CUDA in the same llama.cpp build (you may have to disable the NVIDIA devices in the Vulkan backend using the GGML_VK_VISIBLE_DEVICES environment variable).
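
As a rough sketch of how a visible-device list might be read, assuming GGML_VK_VISIBLE_DEVICES holds comma-separated device indices (for example `1,2`). The helpers `parse_visible_devices` and `visible_devices_from_env` are hypothetical and do not reproduce the backend's actual parsing code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Parse a comma-separated list such as "1,2" into device indices.
// Assumption: every token is a non-negative integer; std::stoul throws otherwise.
std::vector<std::size_t> parse_visible_devices(const std::string & value) {
    std::vector<std::size_t> indices;
    std::stringstream ss(value);
    std::string token;
    while (std::getline(ss, token, ',')) {
        if (token.empty()) {
            continue; // skip empty tokens such as a trailing comma
        }
        indices.push_back(static_cast<std::size_t>(std::stoul(token)));
    }
    return indices;
}

// Read the list from the environment; an unset variable gives an empty list.
std::vector<std::size_t> visible_devices_from_env() {
    const char * value = std::getenv("GGML_VK_VISIBLE_DEVICES");
    return value ? parse_visible_devices(value) : std::vector<std::size_t>{};
}
```

With this sketch, a value that lists only the non-NVIDIA device indices would leave the NVIDIA devices to the CUDA backend.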

A review was requested from 0cc4m 297 days ago
slaren vulkan : add backend registry / device interfaces
5f4e30dd
slaren force pushed from e25c9c12 to 9e04f2cb 294 days ago
slaren llama : print devices used on model load
20ca856a
slaren force pushed from 9e04f2cb to 20ca856a 294 days ago
MaggotHATE commented 293 days ago

Seems to work fine (Win10), but I'm noticing another increase in layer size. Previously, with Mistral-Nemo-Instruct-2407.q5_k_l, I could offload 5 layers with 3 GB of VRAM; now it's only 3. Is this expected? The total VRAM usage is pretty much the same as before the backend registry updates.

slaren commented 293 days ago (👍 1)

I don't think there are any changes here that could increase the memory usage. It's just exposing existing functionality of the Vulkan backend through a different interface.

0cc4m commented 285 days ago

@slaren Thank you for implementing this. I can confirm that it builds on Linux and that the code looks good. I can't fully test it at the moment because my server is still disassembled while I'm in the process of moving between cities. I should be able to reassemble it this weekend, but I'm still very busy. You can decide whether you prefer to wait or whether you think it's ready to merge.

slaren commented 285 days ago

Can you check the changes to ggml_backend_vk_get_device_description? Previously, it wouldn't translate the device index to the indexes given by GGML_VK_VISIBLE_DEVICES, which I believe was a bug. Other than that, I think that there is very little chance that this PR breaks anything.
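
To make the index issue concrete, here is a minimal sketch under the assumption that the backend keeps a list of visible physical device indices. `translate_device_index` and `describe_device` are hypothetical helpers, not the actual implementation of ggml_backend_vk_get_device_description.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Map a backend-local (logical) index to a physical device index
// through the visible-device list.
std::size_t translate_device_index(const std::vector<std::size_t> & visible,
                                   std::size_t logical_idx) {
    assert(logical_idx < visible.size());
    return visible[logical_idx];
}

// Look up a description by logical index. Without the translation step,
// logical index 0 would always describe physical device 0, even when
// the visible list starts with a different device.
std::string describe_device(const std::vector<std::string> & physical_names,
                            const std::vector<std::size_t> & visible,
                            std::size_t logical_idx) {
    const std::size_t physical_idx = translate_device_index(visible, logical_idx);
    assert(physical_idx < physical_names.size());
    return physical_names[physical_idx];
}
```

For instance, with a visible list of `{2, 0}`, logical device 0 refers to physical device 2, so an untranslated lookup would report the wrong device.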

0cc4m commented 285 days ago (👍 1)

That was a bug, yeah.

ggerganov approved these changes on 2024-10-16
slaren Merge remote-tracking branch 'origin/master' into sl/vulkan-reg-2
2363a480
slaren merged f010b77a into master 285 days ago
slaren deleted the sl/vulkan-reg-2 branch 285 days ago
