llama.cpp
llama : initial ggml-backend integration
#4520
Merged

slaren merged 24 commits into master from sl/ggml-backend-int
slaren llama : initial ggml-backend integration
8e6735ec
slaren force pushed to 8e6735ec 2 years ago
slaren add ggml-metal
0808aa5a
slaren Merge remote-tracking branch 'origin/master' into sl/ggml-backend-int
94507911
slaren commented on 2023-12-19
slaren cuda backend can be used through ggml-backend with LLAMA_GGML_BACKEND_…
0c5ee7c4
slaren add ggml_backend_buffer_clear
1ac01fbb
slaren add ggml_backend_buffer_is_host, used to avoid copies if possible when…
c8bd5d8b
slaren disable gpu backends with ngl 0
72a0c966
slaren more accurate mlock
d3e7242b
slaren unmap offloaded part of the model
c3678ca8
slaren force pushed to c3678ca8 2 years ago
slaren use posix_fadvise64(.., POSIX_FADV_SEQUENTIAL) to improve performance…
52410458
slaren update quantize and lora
bcd87ca9
slaren update session copy/set to use ggml-backend
24cc3219
slaren marked this pull request as ready for review 2 years ago
cebtenzzre commented on 2023-12-20
slaren use posix_fadvise instead of posix_fadvise64
f70f94df
slaren ggml_backend_alloc_ctx_tensors_from_buft : remove old print
6c045a86
slaren llama_mmap::align_offset : use pointers instead of references for out…
5834a253
slaren restore progress_callback behavior
ecb23d4a
slaren move final progress_callback call to load_all_data
8ed2a8eb
ggerganov requested a review from ggerganov 2 years ago
ggerganov added high priority
ggerganov added need feedback
ggerganov approved these changes on 2023-12-21
ggerganov cuda : fix fprintf format string (minor)
a4e191f3
slaren do not offload scales
a74b1a89
slaren Merge remote-tracking branch 'origin/master' into sl/ggml-backend-int
6a72c7f2
slaren force pushed to 6a72c7f2 2 years ago
slaren llama_mmap : avoid unmapping the same fragments again in the destructor
cd4167b6
slaren Merge remote-tracking branch 'origin/master' into sl/ggml-backend-int
16582cdf
slaren remove unnecessary unmap
323881ef
slaren commented on 2023-12-21
slaren metal : add default log function that prints to stderr, cleanup code
f4d884f4
slaren force pushed to f4d884f4 2 years ago
slaren merged d232aca5 into master 2 years ago
slaren deleted the sl/ggml-backend-int branch 2 years ago