llama.cpp
`server`: streaming of tool calls and thoughts when `--jinja` is on
#12379
Merged

ochafik merged 102 commits into ggml-org:master from ochafik:tool-diffs
add common_regex w/ support for partial final matches
16c9c633
add common_json w/ support for truncated json healing
6dcff433
renaming: string_find_partial_stop (moved to common.cpp)
a95fe780
add common_chat_msg_diff
ce2f593b
partial common_chat_parse
cd3681dc
refactor parser w/ optionals
94623655
server: wire chat diffs in stream mode
6ed8a8ff
fix trigger of thinking models (must happen after thoughts are closed)
eaeed7da
nits + docs
d6e680a3
fix functionary v3.2 raw python!
64ea080a
rename: common_chat_syntax (now contains format)
c46d4da4
rm common_regex.at_start
4358d5d6
github-actions added documentation
github-actions added testing
github-actions added examples
github-actions added python
github-actions added server
Merge remote-tracking branch 'origin/master' into tool-diffs
f477288f
fix gcc compilation
e0202b37
ochafik force pushed to e0202b37 315 days ago
fix unreachable code warning after [[noreturn]] annotation
f840e3a1
fix / refactor test-regex-partial
af7391e4
fix test-chat
449917bd
rm spaces
b428b5c6
fix command r7b partial parsing (lacked args path)
668fc907
Update test_chat_completion.py
b48ab23b
refactor + test chat parser (try_consume_json_with_dumped_args, liter…
aefc8a45
return partial msg from server
22428a43
refactor partial json
5b9c5a4e
don't return empty <think></think>
3fbe84f9
test_tool_call: allow comment lines in now-multiline code strings (fo…
d4cb7fe7
accommodate yet another deepseek r1 distill fantasy syntax (<|tool▁ca…
31f5eb21
rm space
bddc65a9
nit: fix python type
ea3bf032
ochafik force pushed to ea3bf032 314 days ago
refactor test-chat-parser
f3bfbc6e
ochafik force pushed to f3bfbc6e 314 days ago
fix QwQ 32B tool call parsing after thoughts (hermes2)
bb7b9fea
fix thinking models + tool calls (</think> not part of trigger's capt…
f0ea3308
reinstate tool call id logic, keep track of previously generated ids
7856949f
better logs for triggers
2412b5d3
fix msg diff test
02913b0e
try_consume_regex: basic tests + fix non-partial case
c5c3482b
chat-parser: test+fix finish, incomplete methods
af79da0c
normalize args in test-chat
562800f9
consume spaces after parse_json_tool_calls
ddeb3180
Revert "fix thinking models + tool calls (</think> not part of trigge…
6c3f87ea
fix required tool calls w/ thinking models that have pre-opened think…
e2cef665
fix thinking model's initial trigger (take 2) + test qwq's template
7a61eca0
refactor chat parser (rm incomplete)
2f55571c
test groups of common_chat_msg_parser.try_consume_regex
303f6409
run most test_tool_call tests in stream + non-stream modes
e9540ad5
make functionary v3.2 parsing more strict (differentiate first match …
a8181142
send final diff from server, to close off raw python arguments
5031366c
nit: spaces
dae6a289
fix diff aggregation logic in make_any_request
f026cb04
fix test_chat_completion_with_timings_per_token & test_logprobs_stream
e7f9d3e7
add missing functional import for gcc compilation
165b5258
fix typo in test_calc_result
9d4a6f1e
fix thoughts parsing logic
64b40398
support partial content streaming in Generic mode
fbba5da9
strip reasoning (now that tags are strings and not regexes)
4dcd6532
run test_thoughts in stream mode too
56156b7a
r1: avoid partial call triggers from spaces
5dfa2f7b
fix test_thoughts / refactor expectations
91a50848
fix partial json crashes
4f78d445
fix test-chat's unparsed thought expectation
ea57e472
Merge remote-tracking branch 'origin/master' into tool-diffs
1d251780
fix partial json crash after comma
42cb16f5
fix test-chat.cpp
37b4a3a7
fix gcc build of test
13d725dd
Merge remote-tracking branch 'origin/master' into tool-diffs
a40aeadc
Merge remote-tracking branch 'origin/master' into tool-diffs
329d943b
Merge remote-tracking branch 'origin/master' into tool-diffs
e63e542a
fix regex-partial (drop reluctant repetitions conversions)
21cd34c2
partial regex: allow newlines in prefixes
5f0450db
tool-call: allow content prelude before hermes2 tool calls (for Qwen2.5)
36ecb010
Update function-calling.md
68eeff1a
nit: spaces
12deff6a
Update tool_bench.py
d0a686b0
Merge remote-tracking branch 'origin/master' into tool-diffs
a604b2df
Inject date_string in llama 3.x + test it & functionary v2
90789cd4
github-actions added script
f-krull commented on 2025-04-05
Inject date_string in llama 3.x + fix for functionary v2
71435cf6
add missing chrono include
543b73e8
move/fix detection of functionary v3.1 before llama 3.x, fix & test t…
e3c372c6
Merge branch 'date' into tool-diffs
387611a3
Merge remote-tracking branch 'origin/master' into tool-diffs
01a3e31c
move string_find_partial_stop & string_ends_with to common
59b87c50
add common_regex (supports partial matches)
ff353748
Update test-regex-partial.cpp
869e1a92
ggerganov added tool calling
ochafik Update common/common.cpp
6f109fa4
ochafik Update common/regex-partial.cpp
908e12f4
ochafik Update common/regex-partial.cpp
868b442d
ochafik Update common/regex-partial.h
2ea5f5c2
partial regex: add missing iterator end checks
b275da3c
string utils: use string_views
9b620e56
direct throw to avoid ggml.h include
5c99bdc4
regex-partial: replace missed ggml_asserts
e051be68
ericcurtin requested a review from copilot-pull-request-reviewer 268 days ago
copilot-pull-request-reviewer commented on 2025-04-30
Merge remote-tracking branch 'origin/master' into partial-regex
afce5530
Merge branch 'partial-regex' into tool-diffs
c879a575
Merge remote-tracking branch 'origin/master' into tool-diffs
ad07a3b0
fix merge
573e8c3d
Merge remote-tracking branch 'origin/master' into tool-diffs
d6e1d5bf
Merge remote-tracking branch 'origin/master' into tool-diffs
6946a835
chat-parser: remove input from exception (llm output may contain PII)
224101b4
Merge remote-tracking branch 'origin/master' into tool-diffs
6ddda107
disable failing tests from test_tool_call.py
8886c244
ochafik marked this pull request as ready for review 252 days ago
ochafik requested a review from ngxson 252 days ago
ochafik marked this pull request as draft 252 days ago
json-partial: add comments
810c4c32
Merge remote-tracking branch 'origin/master' into tool-diffs
f0d5df28
Merge remote-tracking branch 'origin/master' into tool-diffs
40951c81
ochafik marked this pull request as ready for review 244 days ago
ericcurtin requested a review from copilot-pull-request-reviewer 244 days ago
copilot-pull-request-reviewer commented on 2025-05-24
ericcurtin approved these changes on 2025-05-24
ochafik merged f5cd27b7 into master 243 days ago
