ochafik/llama.cpp

add common_chat_msg_parser

ochafik committed 358 days ago

19c64be9

llguidance : official v0.7.20 release (no actual changes) [noci] (#13594)

CoffeeVampir3 committed 358 days ago

Verified 3e0be1ca

server : do not return error out of context (with ctx shift disabled) (#13577)

ngxson committed 358 days ago

Verified 6aa892ec

webui : improve accessibility for visually impaired people (#13551)

ngxson committed 358 days ago

Verified aea9f8b4

readme : add list of dependencies and their license (#13591)

ngxson committed 358 days ago

Verified 06c1e4ab

releases : use arm version of curl for arm releases (#13592)

slaren committed 358 days ago

Verified 415e40a3

metal : add FA-vec kernel for head size 64 (#13583)

ggerganov committed 358 days ago

Verified 654a6779

llama : print hint when loading a model when no backends are loaded (#13589)

slaren committed 358 days ago

Verified 5364ae4b

ci : add ppc64el to build-linux-cross (#13575)

CISC committed 358 days ago

Verified 7c07ac24

sycl : fixed compilation warnings (#13582)

lslusarczyk committed 358 days ago

Verified 0a338ed0

minja: sync (qwen3) (#13573)

ochafik committed 359 days ago

Verified bc098c3c

gguf : use ggml log system (#13571)

slaren committed 359 days ago

Verified c6a2c9e7

gguf-py : fix disconnect-before-connect in editor-gui (#13569)

danielzgtg committed 359 days ago

Verified 07ad2b6d

convert : fix conversion for llama 4 (#13567)

ngxson committed 359 days ago

Verified c531edfa

sycl: simplify bin_bcast_kernel (#13383)

AD2605 committed 359 days ago

Verified 02cdd2d8

sycl: reordered Q4_K MMVQ (#13109)

sgeor255 committed 359 days ago

Verified 64bb51cf

sycl: use oneDNN for matrices multiplication (#12972)

lslusarczyk committed 359 days ago

Verified 9c404ed5

llama-bench : fix -ot with dl backends (#13563)

slaren committed 359 days ago

Verified 6c8b9150

webui : handle PDF input (as text or image) + convert pasted long content to file (#13562)

ngxson committed 359 days ago

Verified 3cc1f1f1

server : proper error handling for missing elements in messages array (OpenAI compatible backend) (#13540)

pwilkin committed 359 days ago

Verified c753d7be

bench : handle decode errors (#13548)

ggerganov committed 359 days ago

Verified b2838049

`server`: inject date_string in llama 3.x template + fix date for firefunction v2 (#12802)

ochafik committed 359 days ago

Verified aa48e373

kv-cache : fix out-of-bounds view during reserve graph (#13547)

ggerganov committed 360 days ago

Verified e3a9421b

arm64: optimize q6_k_q8_k kernel with i8mm (#13519)

cyb70289 committed 360 days ago

Verified 5ab5d5fb

`common`: add partial regex support (#12808)

ochafik committed 360 days ago

Verified 3198405e

editorconfig : fix trailing whitespace from #13542 (#13546)

CISC committed 360 days ago

Verified f5170c1d

fix: crash when calling `llama_state_get_size` on a context without a KV cache (#13542)

giladgd committed 360 days ago

Verified 017f10b5

CUDA: fix crash on large batch size for quant. MoE (#13537)

JohannesGaessler committed 360 days ago

Verified 4696d567

llama : fix quantize with dl backends (#13539)

slaren committed 360 days ago

Verified b7d26720

CUDA: faster Deepseek FA, add Turing support (#13435)

JohannesGaessler committed 360 days ago

Verified 6da34fa2
