ochafik/llama.cpp

Commits

Update README.md

Olivier Chafik committed 1 year ago

33efcb3c

disable some failing chatml tests

Olivier Chafik committed 1 year ago

098629df

fix --think arg env

Olivier Chafik committed 1 year ago

0917e0a8

Update README.md

Olivier Chafik committed 1 year ago

39b50c37

align Command R7B w/ --think / reasoning_content behaviour

Olivier Chafik committed 1 year ago

e6d9b524

fix compiler warning about parens

Olivier Chafik committed 1 year ago

3841a163

fix test_thoughts

ochafik committed 1 year ago

f3e9f8b6

Merge branch 'r1-toolcall' of github.com:ochafik/llama.cpp into r1-toolcall

ochafik committed 1 year ago

d20c2ce4

--think to force any model to return reasoning_content (or just parse <think> for deepseek r1)

ochafik committed 1 year ago

9d7c3cc5

metal : adjust support conditions for norm operators (#11671)

ggerganov committed 1 year ago

Verified d774ab3a

CUDA: support for mat. mul. with ne03 != ne13 (#11656)

JohannesGaessler committed 1 year ago

Verified fa62da9b

llava: add quantization for the visual projector LLAVA, Qwen2VL (#11644)

samkoesnadi committed 1 year ago

Verified 1ec20808

Merge branch 'master' into r1-toolcall

ochafik committed 1 year ago

Verified 1f1f06aa

`sync`: minja (#11641)

ochafik committed 1 year ago

Verified 9f4cc8f8

CUDA: non-contiguous (RMS) norm support (#11659)

JohannesGaessler committed 1 year ago

Verified fd08255d

HIP: force max threads per block to be 1024 (#11621)

fxzjshm committed 1 year ago

Verified 3ec9fd4b

Update test_tool_call.py

Olivier Chafik committed 1 year ago

5d60cebb

server : add try..catch to places not covered by set_exception_handler (#11620)

ngxson committed 1 year ago

Verified 3962fc1a

arg : list RPC devices first when using --list-devices (#11655)

rgerganov committed 1 year ago

Verified 1bef571f

Merge branch 'master' into r1-toolcall

Olivier Chafik committed 1 year ago

933f7a18

`tool-call`: command r7b fix for normal responses (#11608)

ochafik committed 1 year ago

Verified db288b60

update readme section about common model tool call formats

Olivier Chafik committed 1 year ago

b2d17287

return thoughts in reasoning_content field

Olivier Chafik committed 1 year ago

39c1d816

readme : add llm_client Rust crate to readme bindings (#11628)

ShelbyJenkins committed 1 year ago

Verified 106045e7

swift : fix llama-vocab api usage (#11645)

jhen0409 committed 1 year ago

Verified f117d84b

metal : use residency set for other platforms (#11648)

jhen0409 committed 1 year ago

Verified 534c46b5

authors : update

ggerganov committed 1 year ago

Verified 387a1598

ggerganov committed 1 year ago

Verified 7c9e0ca5

cmake: Add ability to pass in GGML_BUILD_NUMBER (ggml/1096)

ckastner committed 1 year ago

Verified 8f8290ad

r1: revert making <｜tool▁calls▁begin｜> optional as somehow sampling triggers us on "<｜tool▁call▁begin｜><", which is already invalid per the grammar

ochafik committed 1 year ago

d1b66910
