llama.cpp
Implement non-mapped async IO for CUDA on Windows.
#7896
Merged

slaren merged 7 commits into ggml-org:master from mtavenrath:direct_io
Commits and review timeline:

32dd2ef1  mtavenrath: Implement non-mapped async IO for CUDA on Windows. On a fast Gen5 NVM…
mofosyne added the "Review Complexity : Low" label
1ebe2078  mtavenrath: Free resources except for backend.
slaren commented on 2024-06-12
86869fbd  mtavenrath: Change assertions to exceptions in llama_file, find correct cuda back…
slaren commented on 2024-06-13
c39d5ecd  mtavenrath: Apply suggestions from code review
d3131ce5  slaren: Fix editorconfig and unused variable
45c483ce  slaren: Merge remote-tracking branch 'origin/master' into direct_io
f4d33f87  slaren: Fix issues with Windows build
slaren approved these changes on 2024-06-13
slaren merged 6a2f0b34 into master 1 year ago
