llama.cpp
5d44db60 - server, webui: support continue generation on reasoning models (#22727)

Commit

15 hours ago

server, webui: support continue generation on reasoning models (#22727)

* server, webui : support continue generation on reasoning models (#22727)

Remove the throw blocking assistant prefill on reasoning models and orchestrate thinking tags around the prefilled message so the parser routes the next stream chunks correctly.

WebUI drops the reasoning guard on the Continue button, sends reasoning_content with the prefilled message and persists partial reasoning on stop so the CoT survives reload and resume.

Scope : templates with a simple thinking_start_tag / thinking_end_tag pair. Channel-based templates like GPT-OSS are out of scope, pending a per-template prefill API in common/chat.

First step toward #21754.

* chore: update webui build output

* server: reject reasoning prefill on channel based templates
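
The tag orchestration described in the commit message can be illustrated with a small sketch. This is not the server's actual code (which is C++ and not shown on this page): the function name build_assistant_prefill and its parameters are hypothetical. Only the reasoning_content field and the thinking_start_tag / thinking_end_tag pair come from the commit message. The sketch assumes that a prefill with reasoning but no answer text means the model stopped mid-reasoning, so the thinking block is left open.

```python
from typing import Optional


def build_assistant_prefill(
    content: str,
    reasoning_content: Optional[str],
    thinking_start_tag: Optional[str],
    thinking_end_tag: Optional[str],
) -> str:
    """Hypothetical helper: text appended to the prompt for a prefilled assistant turn."""
    # No reasoning to restore: plain assistant prefill, no tags involved.
    if not reasoning_content:
        return content

    # Assumption: a template without a simple start/end tag pair
    # (e.g. a channel-based one) cannot be handled, so the request is rejected.
    if not thinking_start_tag or not thinking_end_tag:
        raise ValueError("reasoning prefill is not supported for this template")

    prefill = thinking_start_tag + reasoning_content
    if content:
        # Answer text already exists, so the reasoning block is closed before it.
        prefill += thinking_end_tag + content
    # Otherwise the block stays open and the model resumes inside its reasoning.
    return prefill


if __name__ == "__main__":
    # Resume in the middle of the chain of thought.
    print(build_assistant_prefill("", "Let me check", "<think>", "</think>"))
    # Resume in the middle of the answer.
    print(build_assistant_prefill("The answer is", "Let me check", "<think>", "</think>"))
```

Leaving the block open or closed is what lets a downstream parser decide whether the next streamed chunks belong to the reasoning or to the answer. A prefill whose reasoning had finished but whose answer is still empty is ambiguous under this assumption and would need an explicit flag in a real implementation.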

References

#22727 - server, webui: support continue generation on reasoning models

Author

ServeurpersoCom

Parents

3796c94b
