Update the llamacpp backend #3022
Hugoch commented on 2025-02-14
angt dismissed their stale review via 9714f015 (1 year ago)
Narsil commented on 2025-02-18
Narsil commented on 2025-02-18
Narsil dismissed these changes on 2025-02-18
angt dismissed their stale review via eeff235c (1 year ago)
fgbelidji approved these changes on 2025-02-19
angt marked this pull request as ready for review (1 year ago)
angt force-pushed from 7461a89a to 0e681c79 (1 year ago)
angt force-pushed from 8adb9f20 to e9d18b07 (362 days ago)
Build faster (bda39e42)
Make --model-gguf optional (2d4aa25b)
Bump llama.cpp (46bc8e6b)
Enable mmap, offload_kqv & flash_attention by default (30cd3cf5)
Update doc (2242d1a6)
Better error message (0d01a89f)
Update doc (7388468e)
Update installed packages (961a133d)
Save gguf in models/MODEL_ID/model.gguf (d41183a0)
Fix build with Mach-O (6223b6e2)
Quantize without llama-quantize (0a55bd3d)
Bump llama.cpp and switch to ggml-org (38492233)
Remove make-gguf.sh (46feaf62)
Update Cargo.lock (aadd6249)
Support HF_HUB_USER_AGENT_ORIGIN (8fe85120)
Bump llama.cpp (8a79cfd0)
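One of the commits above adds support for HF_HUB_USER_AGENT_ORIGIN. A minimal sketch of how an operator might set it before launching the backend — only the variable name comes from the commit title; the value here is a made-up example:

```shell
# Hypothetical usage: identify the calling application to the
# Hugging Face Hub when model files are downloaded.
# Only HF_HUB_USER_AGENT_ORIGIN comes from this PR; the value is an example.
export HF_HUB_USER_AGENT_ORIGIN=my-org/my-app
echo "$HF_HUB_USER_AGENT_ORIGIN"
```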
angt force-pushed from e9d18b07 to 8a79cfd0 (362 days ago)
Add --build-arg llamacpp_native & llamacpp_cpu_arm_arch (3f7369d1)
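The commit above introduces two Docker build arguments; an invocation might look like the following sketch. Only the build-arg names come from the commit title — the Dockerfile name, image tag, and argument values are assumptions:

```shell
# Hypothetical build command. Only the --build-arg names
# (llamacpp_native, llamacpp_cpu_arm_arch) come from this PR;
# the Dockerfile name, tag, and values are illustrative guesses.
docker build \
    --build-arg llamacpp_native=off \
    --build-arg llamacpp_cpu_arm_arch=armv8.2-a \
    -f Dockerfile_llamacpp \
    -t tgi-llamacpp \
    .
```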
angt merged 094975c3 into main (356 days ago)