llama.cpp
e1f15b45 - vulkan: Implement set_tensor_async and the event interfaces (#18047)

Commit

169 days ago

vulkan: Implement set_tensor_async and the event interfaces (#18047)

The goal is to enable the async loading code paths in llama_model_loader::load_all_data, originally from #7896. This works, and the loads themselves are faster. With host-visible vidmem, however, I think the cost of allocating/mapping vidmem shifts and becomes more expensive, so I don't see a benefit by default. With GGML_VK_DISABLE_HOST_VISIBLE_VIDMEM=1, I do see a significant improvement in model loading time.
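For readers unfamiliar with the pattern, here is a minimal sketch of an async upload loop gated by events. It is not the ggml or Vulkan backend code: `MockDevice`, `set_async`, and `upload_with_staging` are hypothetical names, the "device" is a plain host vector, and `std::future` stands in for a backend event. It only illustrates the idea of filling one staging buffer while earlier uploads are still in flight, and waiting on a buffer's event before reusing it.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <future>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a device: "uploads" complete on another thread.
struct MockDevice {
    std::vector<unsigned char> memory;

    explicit MockDevice(size_t n) : memory(n) {}

    // Mimics an async tensor write: returns immediately, the copy finishes later.
    // The returned future plays the role of an event recorded after the write.
    std::future<void> set_async(size_t offset, const unsigned char * src, size_t size) {
        return std::async(std::launch::async, [this, offset, src, size] {
            std::memcpy(memory.data() + offset, src, size);
        });
    }
};

// Copies `src` to the device through a small ring of staging buffers.
inline void upload_with_staging(MockDevice & dev, const std::vector<unsigned char> & src,
                                size_t n_buffers, size_t chunk) {
    if (n_buffers == 0 || chunk == 0) {
        throw std::invalid_argument("n_buffers and chunk must be non-zero");
    }

    std::vector<std::vector<unsigned char>> staging(n_buffers, std::vector<unsigned char>(chunk));
    std::vector<std::future<void>> events(n_buffers);

    size_t idx = 0;
    for (size_t off = 0; off < src.size(); off += chunk) {
        const size_t n = std::min(chunk, src.size() - off);

        // Wait for the previous upload that used this staging buffer before overwriting it.
        if (events[idx].valid()) {
            events[idx].wait();
        }

        // Stands in for reading the next piece of the model file into the staging buffer.
        std::memcpy(staging[idx].data(), src.data() + off, n);

        // Queue the upload and keep its "event" so the buffer can be reused safely later.
        events[idx] = dev.set_async(off, staging[idx].data(), n);

        idx = (idx + 1) % n_buffers;
    }

    // Drain all uploads still in flight.
    for (auto & e : events) {
        if (e.valid()) {
            e.wait();
        }
    }
}
```

Calling `upload_with_staging(dev, data, 4, 1024)` on a `MockDevice` sized to `data.size()` should leave `dev.memory` equal to `data`. The buffer count and chunk size here are arbitrary illustration values, not the ones the loader uses.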

References

#18047 - vulkan: Implement set_tensor_async and the event interfaces

Author

jeffbolznv

Parents

0e1ccf15
