llama.cpp
Simplify and improve CUDA graphs through use of indirect copy pointers
#9017
Merged

agray3
github-actions added Nvidia GPU
github-actions added ggml
agray3 marked this pull request as draft 1 year ago
agray3 CUDA: Simplify and improve CUDA graphs through use of indirect copy p…
e9a1be0a
agray3 force pushed from 38f4863a to e9a1be0a 276 days ago
agray3 marked this pull request as ready for review 262 days ago
slaren commented on 2025-03-25
agray3 Addressed comments
1a2441ad
IMbackK approved these changes on 2025-03-26
slaren requested changes on 2025-03-29
agray3 fix HIP builds
a3d13183
agray3 properly sync to stream
6d7df919
agray3 removed ggml_cuda_cpy_fn_ptrs
04a73070
agray3 move stream sync before free
c255a0fd
agray3 guard to only use indirection with graphs
21fae96d
slaren style fixes
61622c0e
slaren approved these changes on 2025-04-01
slaren check for errors
fd88d2b1
slaren merged 3f9da22c into master 254 days ago
