llama.cpp
llama/ggml: multi-GPU pipeline parallelism (xdev host staging) + faster model loading
#19922
Closed

mxxm-t wants to merge 1 commit into ggml-org:master from mxxm-t:pipeline-parallelism
pipeline-parallelism: xdev host staging + load-time toggles (commit dee003ea)
mxxm-t requested a review from CISC 9 days ago
mxxm-t requested a review from ggerganov 9 days ago
github-actions added the Nvidia GPU and ggml labels
mxxm-t closed this 4 days ago
