llama.cpp
CUDA memory pool with async memory allocation/deallocation
#3903

Merged

ggerganov merged 3 commits into ggml-org:master from young-developer:cuda-memory-pool

Using cuda memory pools for async alloc/dealloc.

08868a44

If the CUDA device doesn't support memory pools, then use the old implementation.

7e6f4132
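
The two commits above describe a pattern: allocate from a CUDA memory pool with stream-ordered (async) calls, and fall back to another path on devices without pool support. The following is a minimal sketch of that pattern, not the code merged in this PR. The helper names `sketch_pools_supported`, `sketch_alloc`, and `sketch_free` are hypothetical, and plain `cudaMalloc`/`cudaFree` stand in for the "old implementation", which this page does not describe. It assumes CUDA 11.2 or newer, where the stream-ordered allocator is available.

```cuda
#include <cuda_runtime.h>
#include <cstddef>

// Hypothetical helper: true if `device` supports stream-ordered memory pools.
static bool sketch_pools_supported(int device) {
    int supported = 0;
    cudaError_t err = cudaDeviceGetAttribute(
        &supported, cudaDevAttrMemoryPoolsSupported, device);
    return err == cudaSuccess && supported != 0;
}

// Hypothetical helper: allocate `size` bytes on `device`.
// Uses the device's default memory pool when supported, ordered on `stream`;
// otherwise falls back to a synchronous cudaMalloc (an assumed stand-in for
// the PR's "old implementation"). Returns nullptr on failure.
static void* sketch_alloc(size_t size, int device, cudaStream_t stream) {
    void* ptr = nullptr;
    if (cudaSetDevice(device) != cudaSuccess) {
        return nullptr;
    }
    if (sketch_pools_supported(device)) {
        cudaMemPool_t pool = nullptr;
        if (cudaDeviceGetDefaultMemPool(&pool, device) != cudaSuccess) {
            return nullptr;
        }
        if (cudaMallocFromPoolAsync(&ptr, size, pool, stream) != cudaSuccess) {
            return nullptr;
        }
        return ptr;
    }
    // Fallback path for devices without memory pool support.
    if (cudaMalloc(&ptr, size) != cudaSuccess) {
        return nullptr;
    }
    return ptr;
}

// Hypothetical helper: release memory obtained from sketch_alloc.
// The same support check selects the matching free call, so a pointer is
// always released through the path that allocated it.
static void sketch_free(void* ptr, int device, cudaStream_t stream) {
    if (ptr == nullptr) {
        return;
    }
    if (sketch_pools_supported(device)) {
        cudaFreeAsync(ptr, stream);  // ordered after prior work on `stream`
    } else {
        cudaFree(ptr);               // synchronous fallback
    }
}
```

With the pool path, the free is enqueued on the stream rather than forcing a device-wide synchronization, which is the usual motivation for stream-ordered allocation. A real implementation would likely cache the support check and the pool handle per device instead of querying them on every call.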

young-developer changed the title ~~CUDA memory pool with async memory allocation deallocation~~ CUDA memory pool with async memory allocation/deallocation 2 years ago

slaren commented on 2023-11-02

Removed redundant cublasSetStream

587ff3bf

slaren approved these changes on 2023-11-02

ggerganov approved these changes on 2023-11-02

ggerganov merged d6069051 into master 2 years ago

young-developer deleted the cuda-memory-pool branch 2 years ago

Reviewers

ggerganov

slaren

