llama.cpp
vulkan: Fix ErrorOutOfHostMemory on Intel GPU when loading large models with --no-mmap
#20059
Merged


rillomas Changed to reuse command buffers to fix crashing on Intel GPU
4b52568b
github-actions added the Vulkan and ggml labels
rillomas Removed unused parameter
d1dd8147
rillomas Fixed compile error and minor mistake
668d245e
rillomas Fix logging
29a1a01a
rillomas marked this pull request as ready for review 20 days ago
rillomas requested a review from 0cc4m 20 days ago
0cc4m commented on 2026-03-06
rillomas marked this pull request as draft 15 days ago
rillomas Changing to use usage flag per command buffer
e1f8ce0c
rillomas fixed style
19d54833
rillomas added buffer reset
ffed7e53
rillomas Removed cmd_buffer_idx for reuse consistency
a0fecda9
rillomas Merge remote-tracking branch 'origin/master' into fix-async-tensor-crash
498ff284
rillomas marked this pull request as ready for review 14 days ago
rillomas Fixed style
d3fab849
0cc4m approved these changes on 2026-03-12
0cc4m merged 5866e3bb into master 12 days ago
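Per the commit messages above, the fix changes the Vulkan backend to reuse command buffers (resetting them and setting the usage flag per command buffer) rather than allocating a fresh one for each operation, which exhausted host memory on Intel drivers when loading large models with --no-mmap. A minimal, non-runnable sketch of that reuse pattern; all names here (get_upload_cmd_buffer, cmd_buf) are illustrative, not the actual ggml-vulkan identifiers:

```c
// Illustrative sketch only -- not the actual ggml-vulkan code.
// Assumes the pool was created with
// VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT so individual
// command buffers can be reset.
static VkCommandBuffer cmd_buf = VK_NULL_HANDLE;

VkCommandBuffer get_upload_cmd_buffer(VkDevice device, VkCommandPool pool) {
    if (cmd_buf == VK_NULL_HANDLE) {
        // Allocate once and keep it for the lifetime of the context.
        VkCommandBufferAllocateInfo ai = {0};
        ai.sType              = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
        ai.commandPool        = pool;
        ai.level              = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
        ai.commandBufferCount = 1;
        vkAllocateCommandBuffers(device, &ai, &cmd_buf);
    } else {
        // Reuse: reset the existing buffer instead of allocating a new
        // one per upload, avoiding unbounded host-memory growth.
        vkResetCommandBuffer(cmd_buf, 0);
    }
    VkCommandBufferBeginInfo bi = {0};
    bi.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
    // Usage flag set per command buffer, as the later commits describe.
    bi.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
    vkBeginCommandBuffer(cmd_buf, &bi);
    return cmd_buf;
}
```

The caller would record copy commands into the returned buffer, end it with vkEndCommandBuffer, submit, and wait before the next reset; the key point is that allocation happens once, not per submission.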
