llama.cpp
Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B
#13386
Merged


slaren merged 4 commits into ggml-org:master from hjc4869:no_op_offload
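
The flag targets setups where `-ot`/`--override-tensor` keeps MoE expert weights in CPU buffers while the rest of the model runs on the GPU. By default the scheduler may still offload large matrix multiplications on those CPU-resident weights to the GPU during prompt processing; `--no-op-offload` turns that off, which is what improves pp throughput in these configurations. A hedged example invocation (the model file name and tensor-override regex below are placeholders, not taken from this PR):

```sh
# Illustrative only: the model path and the -ot regex are placeholders.
# -ot "exps=CPU" keeps tensors whose names contain "exps" (MoE expert weights)
# in CPU buffers; --no-op-offload stops their ops from being bounced to the
# GPU during prompt processing.
./llama-cli -m ./llama4-400b.gguf -ngl 99 \
    -ot "exps=CPU" \
    --no-op-offload \
    -p "Hello"
```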
hjc4869: Add --disable-op-offload (2e747874)
github-actions added the testing, examples, and ggml labels
slaren commented on 2025-05-08
hjc4869: Avoid negative bools in library. (31e19202)
hjc4869: Fix default value of ggml_backend_sched_new (0d53a04b)
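
The two commits above are the library side of the change: per "Avoid negative bools in library", the scheduler takes a positively named `op_offload` control rather than a negated one, and `ggml_backend_sched_new` is where its default is set. A minimal sketch of a caller opting out, assuming the parameter sits at the end of the constructor's argument list (that ordering is inferred from the commit messages, so check `ggml-backend.h` in the merged revision for the exact signature):

```c
// Sketch only: the trailing op_offload parameter and its position are an
// assumption based on this PR's commit messages, not a verified signature.
#include "ggml-backend.h"
#include "ggml-cpu.h"

int main(void) {
    ggml_backend_t cpu = ggml_backend_cpu_init();
    ggml_backend_t backends[] = { cpu };

    // op_offload = false corresponds to passing --no-op-offload on the CLI:
    // ops whose weights live in CPU buffers are not offloaded to another backend.
    ggml_backend_sched_t sched = ggml_backend_sched_new(
        backends, /*bufts=*/NULL, /*n_backends=*/1,
        /*graph_size=*/2048, /*parallel=*/false, /*op_offload=*/false);

    ggml_backend_sched_free(sched);
    ggml_backend_free(cpu);
    return 0;
}
```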
hjc4869 requested a review from slaren 178 days ago
slaren commented on 2025-05-10
hjc4869: Rename to --no-op-offload for consistency (eae3a319)
hjc4869 changed the title from "Add `--disable-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B" to "Add `--no-op-offload` to improve `-ot` pp perf in MoE models like llama4 400B" 178 days ago
slaren approved these changes on 2025-05-11
slaren merged 7f323a58 into master 177 days ago
hjc4869 deleted the no_op_offload branch 177 days ago


Reviewers: slaren
Assignees: No one assigned
Labels: testing, examples, ggml
Milestone: None