llama.cpp
`tool-call`: fix Qwen 2.5 Coder support, add micro benchmarks, support trigger patterns for lazy grammars
#12034

Merged

ochafik merged 48 commits into ggml-org:master from ochafik:tool-bench-prod

sampler: turn lazy grammar trigger words to regexes

b37779b9

add scripts/tool_bench.sh & .py

a4569111

optionally allow any spaces in json schema grammars (useful for llama…

14a43888

constrain llama json output regardless of function name if matches at…

e2ca8be6

better error when wrong function called

53266f9a

github-actions added the script, testing, examples, python, and server labels

improve error message in weather test

7833c167

add more models to tool_bench.sh

0e1a00ec

benchmark other sizes of qwen 2.5 coder

44740f7c

rm duplicate in tool_bench.sh

dd6eb97b

add missing <variant> include

0fc62182

fix lints

6fd4972a

improve "bad" qwen triggers

2e656f9f

add cast to please some gccs

fbd3c197

ditch server test request retry logic

62a1416a

fix flake8 lints

596ff7f3

nits

fe6968f3

remove any_spaces grammar option, allow extra line for airy llama jso…

1caacd5b

Update test_tool_call.py

789a3e1c

test w/ beefier qwen 2.5 coder 3b

6493a14b

revert some test_hello_world diffs

cc817a0a

diff

ead02c6d

Update test_tool_call.py

d7acf2c2

add requirements for tool_bench

0db4073e

fix test_thoughts deepseek test expectation

0ce606b9

Update README.md

a3cde169

update relaxed newline space rule in grammar tests

79ad6236

support add_generation_prompt query parameter (useful for /apply_temp…

3fe208a6

Merge remote-tracking branch 'origin/master' into tool-bench-prod

fe8c79b2

token cast tweak for gcc

99d2d802

fix warning on gcc13 w/ uninitialized variant

c7fa19ae

fix python lints

6e5a830f

ochafik marked this pull request as ready for review 296 days ago

ochafik requested a review from ngxson 296 days ago

fix gcc13 warning

0b5d1055

fix pyright lints in tool_bench.py

7bcc5af0

Merge remote-tracking branch 'origin/master' into tool-bench-prod

d1f48d03

update readme w/ link to tool call

fc19192f

tool-bench: add --ctk, --ctv, --fa flags

60f28ef6

ggerganov approved these changes on 2025-03-03

ngxson approved these changes on 2025-03-04

Merge remote-tracking branch 'origin/master' into tool-bench-prod

2470a1c1

common_grammar_trigger: always use string value (+ optional token)

e6e9c138

add llama_grammar_trigger_pattern

5d43b726

add common_grammar_trigger.{to_json,from_json}

1317a35f

fix crashing typo

ad3caa34

avoid returning optional from parse_json

a6d78873

disable slow hello Llama-3.1-8B (chopped unescaped string within strin…

20a2f5f8

ochafik commented on 2025-03-05

fix nit eol at eof

92e9723d

Update src/llama-grammar.cpp

01be080e

Merge remote-tracking branch 'origin/master' into tool-bench-prod

00db4651

ngxson commented on 2025-03-05

avoid ggml_assert in server for grammar triggers inconsistency

24010fe7

add comment on limits to common_grammar_trigger.to/from json speciali…

71719a6e

ngxson approved these changes on 2025-03-05

ochafik merged 669912d9 into master 288 days ago

Reviewers

ngxson

ggerganov

Assignees

No one assigned

Labels

script testing examples python server

Milestone

No milestone
