llama.cpp
Introduction of CUDA Graphs to LLama.cpp
#6766
Merged
slaren merged 20 commits into ggml-org:master from agray3:ag_cuda_graphs
DRAFT: Introduction of CUDA Graphs to LLama.cpp
cec409aa
phymbert added need feedback
phymbert added performance
FIx issues raised in comments
c8dd0e7c
Tidied to now only use CUDA runtime (not mixed with driver calls)
800f4fe4
disable for multi-gpu and batch size > 1
c2691d96
Disable CUDA graphs for old GPU arch and with env var
df4719ec
added missing CUDA_CHECKs
c3d4ead1
JohannesGaessler commented on 2024-04-24
Addressed comments
d403b180
agray3 changed the title from "DRAFT: Introduction of CUDA Graphs to LLama.cpp" to "Introduction of CUDA Graphs to LLama.cpp" 1 year ago
further addressed comments
40875968
limit to GGML_ALLOW_CUDA_GRAPHS defined in llama.cpp cmake
0640427f
Merge branch 'ggerganov:master' into ag_cuda_graphs
9c578616
Added more comprehensive graph node checking
d44e0fb2
With mechanism to fall back if graph capture fails
eb9f15fb
Revert "With mechanism to fall back if graph capture fails"
909e4c66
Fall back if graph capture fails and address other comments
58199503
Merge branch 'ggerganov:master' into ag_cuda_graphs
44af0964
Merge remote-tracking branch 'origin/master' into ag_cuda_graphs
4e1f2a03
- renamed GGML_ALLOW_CUDA_GRAPHS to GGML_CUDA_USE_GRAPHS
e830949e
fix build without cuda graphs
a4c9b901
ggerganov commented on 2024-05-08
remove outdated comment
ab40e667
replace minimum cc value with a constant
f42312e0
slaren approved these changes on 2024-05-08
JohannesGaessler commented on 2024-05-08
JohannesGaessler approved these changes on 2024-05-08
slaren merged bc4bba36 into master 1 year ago
mofosyne added Review Complexity : High
Reviewers: JohannesGaessler, slaren, ggerganov
Assignees: No one assigned
Labels: performance, need feedback, Review Complexity : High
Milestone: No milestone