PR #4520 llama : initial ggml-backend integration

slaren merged 24 commits into master from sl/ggml-backend-int

llama : initial ggml-backend integration

8e6735ec

slaren force pushed to 8e6735ec 2 years ago

add ggml-metal

0808aa5a

Merge remote-tracking branch 'origin/master' into sl/ggml-backend-int

94507911

slaren commented on 2023-12-19

cuda backend can be used through ggml-backend with LLAMA_GGML_BACKEND_…

0c5ee7c4

add ggml_backend_buffer_clear

1ac01fbb

add ggml_backend_buffer_is_host, used to avoid copies if possible when…

c8bd5d8b

disable gpu backends with ngl 0

72a0c966

more accurate mlock

d3e7242b

unmap offloaded part of the model

c3678ca8

slaren force pushed to c3678ca8 2 years ago

use posix_fadvise64(.., POSIX_FADV_SEQUENTIAL) to improve performance…

52410458

update quantize and lora

bcd87ca9

update session copy/set to use ggml-backend

24cc3219

slaren marked this pull request as ready for review 2 years ago

cebtenzzre commented on 2023-12-20

use posix_fadvise instead of posix_fadvise64

f70f94df
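
For context on the two `posix_fadvise` commits above, here is a minimal sketch of hinting sequential access with the portable `posix_fadvise` call. It is not taken from the PR: the helper name `open_sequential` is hypothetical, and the sketch assumes a platform that provides `posix_fadvise` (for example Linux). Using `posix_fadvise` rather than `posix_fadvise64` is presumably a portability choice, since the `64` variant is not part of POSIX.

```c
#define _POSIX_C_SOURCE 200112L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not from the PR): open `path` read-only and hint to
 * the kernel that the file will be read sequentially.
 * Returns the file descriptor, or -1 if open() fails.
 * A failed hint is treated as non-fatal, since it is only advisory. */
int open_sequential(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
    /* offset 0 with len 0 applies the advice to the whole file.
     * posix_fadvise returns an error number directly; it does not set errno. */
    int err = posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    if (err != 0) {
        fprintf(stderr, "posix_fadvise failed: %d\n", err);
    }
    return fd;
}
```

A caller would read from the returned descriptor as usual and `close()` it when done; the hint only affects readahead behavior, not correctness.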

ggml_backend_alloc_ctx_tensors_from_buft : remove old print

6c045a86

llama_mmap::align_offset : use pointers instead of references for out…

5834a253

restore progress_callback behavior

ecb23d4a

move final progress_callback call to load_all_data

8ed2a8eb

review requested from ggerganov 2 years ago

ggerganov added high priority

ggerganov added need feedback

ggerganov approved these changes on 2023-12-21

cuda : fix fprintf format string (minor)

a4e191f3

do not offload scales

a74b1a89

Merge remote-tracking branch 'origin/master' into sl/ggml-backend-int

6a72c7f2

slaren force pushed to 6a72c7f2 2 years ago

llama_mmap : avoid unmapping the same fragments again in the destructor

cd4167b6

Merge remote-tracking branch 'origin/master' into sl/ggml-backend-int

16582cdf

remove unnecessary unmap

323881ef

slaren commented on 2023-12-21

metal : add default log function that prints to stderr, cleanup code

f4d884f4

slaren force pushed to f4d884f4 2 years ago

slaren merged d232aca5 into master 2 years ago

slaren deleted the sl/ggml-backend-int branch 2 years ago

Reviewers

ggerganov

cebtenzzre

Labels

high priority, need feedback
