Simplify and improve CUDA graphs through use of indirect copy pointers #9017
agray3
marked this pull request as draft 1 year ago
CUDA: Simplify and improve CUDA graphs through use of indirect copy p…
e9a1be0a
agray3
force pushed from 38f4863a to e9a1be0a 276 days ago
agray3
marked this pull request as ready for review 262 days ago
slaren
commented on 2025-03-25
Addressed comments
1a2441ad
IMbackK
approved these changes on 2025-03-26
slaren
requested changes on 2025-03-29
fix HIP builds
a3d13183
properly sync to stream
6d7df919
removed ggml_cuda_cpy_fn_ptrs
04a73070
move stream sync before free
c255a0fd
guard to only use indirection with graphs
21fae96d
style fixes
61622c0e
slaren
approved these changes on 2025-04-01
check for errors
fd88d2b1
slaren
merged 3f9da22c into master 254 days ago