llama.cpp
`server`: add `--reasoning-budget 0` to disable thinking (incl. qwen3 w/ enable_thinking:false)
#13771
Merged

Commits
  • server: fix/test add_generation_prompt
    ochafik committed 325 days ago
  • tools: enable hermes2/qwen chat logic even w/o tools
    ochafik committed 325 days ago
  • server: add --reasoning-format=disabled to disable thinking (incl. qwen3 w/ enable_thinking:false)
    ochafik committed 325 days ago
  • Update README.md
    ochafik committed 325 days ago
  • Add models/templates/Qwen-Qwen3-0.6B.jinja
    ochafik committed 325 days ago
  • update --reasoning-format={disabled -> nothink} as suggested
    ochafik committed 325 days ago
  • fix command r7b's nothink w/ official template
    ochafik committed 325 days ago
  • rewrite docs as list as suggested
    ochafik committed 325 days ago
  • Update common/chat.cpp
    ochafik committed 325 days ago
  • Merge branch 'enable-thinking' of github.com:ochafik/llama.cpp into enable-thinking
    ochafik committed 325 days ago
  • const char* return for chat enum name helpers
    ochafik committed 325 days ago
  • switch to --reasoning-budget flag
    ochafik committed 324 days ago
  • Merge branch 'fix-gen-prompt' into enable-thinking
    ochafik committed 324 days ago