llama.cpp
vulkan: improve partial offloading performance on AMD
#19976
Merged

0cc4m merged 7 commits into master from 0cc4m/vulkan-partial-offload-fix
0cc4m vulkan: fix and enable cpy_tensor_async function
6943f830
0cc4m use transfer_queue for async transfers on AMD, synchronize with timel…
5abb7d55
0cc4m update offload_op logic
ca3481f3
0cc4m fix missing transfer submission
e72fb936
0cc4m disable async transfer queue on AMD GCN
29955d39
0cc4m revert op batch size change
32adb28b
0cc4m requested a review from jeffbolznv 12 days ago
github-actions added the Vulkan label
github-actions added the ggml label
0cc4m fix cpy_tensor_async checks
b2bc5eb1
jeffbolznv approved these changes on 2026-02-28
0cc4m merged commit 31914624 into master 10 days ago
0cc4m deleted the 0cc4m/vulkan-partial-offload-fix branch 10 days ago
