chat-ui
1ea2f606 - chore(models): cap GLM-5.1 max_tokens at 32768

Committed 35 days ago
chore(models): cap GLM-5.1 max_tokens at 32768

The reasoning model needs headroom for thinking traces, but 128K is excessive; this matches the Qwen3.5-397B-A17B precedent.
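A minimal sketch of what capping a model's token budget like this might look like. The names here (`ModelConfig`, `MAX_TOKENS_CAP`, `capMaxTokens`) are illustrative assumptions, not the project's actual API:

```typescript
// Hypothetical model-config shape; field names are assumptions,
// not chat-ui's real schema.
interface ModelConfig {
  name: string;
  maxTokens: number;
}

// Cap chosen in the commit: 32768 leaves room for reasoning traces
// without allowing a full 128K generation budget.
const MAX_TOKENS_CAP = 32768;

// Clamp a model's maxTokens to the cap, leaving smaller values untouched.
function capMaxTokens(model: ModelConfig): ModelConfig {
  return { ...model, maxTokens: Math.min(model.maxTokens, MAX_TOKENS_CAP) };
}

const glm = capMaxTokens({ name: "GLM-5.1", maxTokens: 131072 });
console.log(glm.maxTokens); // 32768
```

Clamping with `Math.min` rather than rejecting over-large configs keeps existing model entries valid while silently enforcing the new ceiling.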