Return an empty array if the JSON schema sets maxItems == 0.
Before:
```console
$ ./build/bin/llama-cli -m ./tmp/mnt/models/quantize/gemma-1.1-2b-it.Q8_0.gguf -j "$(python -c $'import json, pydantic\nclass Result(pydantic.BaseModel): colours:list[str]=pydantic.Field(max_length=0)\nprint(json.dumps(Result.model_json_schema()))')" --no-display-prompt -p "Here are some colours: " -no-cnv
...
system_info: n_threads = 8 (n_threads_batch = 8) / 24 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | LLAMAFILE = 1 | OPENMP = 1 | AARCH64_REPACK = 1 |
parse: error parsing grammar: expecting '}' at -1})? "]" space
colours-kv ::= "\"colours\"" space ":" space colours
root ::= "{" space colours-kv "}" space
space ::= | " " | "\n"{1,2} [ \t]{0,20}
string ::= "\"" char* "\"" space
char ::= [^"\\\x7F\x00-\x1F] | [\\] (["\\bfnrt] | "u" [0-9a-fA-F]{4})
colours ::= "[" space (string ("," space string){0,-1})? "]" space
colours-kv ::= "\"colours\"" space ":" space colours
root ::= "{" space colours-kv "}" space
space ::= | " " | "\n"{1,2} [ \t]{0,20}
string ::= "\"" char* "\"" space
llama_grammar_init_impl: failed to parse grammar
main: failed to initialize sampling subsystem
```
After:
```console
$ ./build/bin/llama-cli -m ./tmp/mnt/models/quantize/gemma-1.1-2b-it.Q8_0.gguf -j "$(python -c $'import json, pydantic\nclass Result(pydantic.BaseModel): colours:list[str]=pydantic.Field(max_length=0)\nprint(json.dumps(Result.model_json_schema()))')" --no-display-prompt -p "Here are some colours: " -no-cnv
...
system_info: n_threads = 8 (n_threads_batch = 8) / 24 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | LLAMAFILE = 1 | OPENMP = 1 | AARCH64_REPACK = 1 |
sampler seed: 3851950458
sampler params:
	repeat_last_n = 64, repeat_penalty = 1.000, frequency_penalty = 0.000, presence_penalty = 0.000
	dry_multiplier = 0.000, dry_base = 1.750, dry_allowed_length = 2, dry_penalty_last_n = 4096
	top_k = 40, top_p = 0.950, min_p = 0.050, xtc_probability = 0.000, xtc_threshold = 0.100, typical_p = 1.000, top_n_sigma = -1.000, temp = 0.800
	mirostat = 0, mirostat_lr = 0.100, mirostat_ent = 5.000
sampler chain: logits -> logit-bias -> penalties -> dry -> top-k -> typical -> top-p -> min-p -> xtc -> temp-ext -> dist
generate: n_ctx = 4096, n_batch = 2048, n_predict = -1, n_keep = 1

{"colours": [ ] } [end of text]

llama_perf_sampler_print: sampling time = 8.25 ms / 41 runs ( 0.20 ms per token, 4971.50 tokens per second)
llama_perf_context_print: load time = 394.69 ms
llama_perf_context_print: prompt eval time = 112.49 ms / 7 tokens ( 16.07 ms per token, 62.23 tokens per second)
llama_perf_context_print: eval time = 1579.12 ms / 33 runs ( 47.85 ms per token, 20.90 tokens per second)
llama_perf_context_print: total time = 2158.70 ms / 40 tokens
```
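For reference, the `-j` argument above is generated by the inline python from a pydantic model with `Field(max_length=0)`; the JSON schema it passes looks approximately like this (assuming standard pydantic v2 output), with the `maxItems: 0` that triggers the empty-array path:

```json
{
  "properties": {
    "colours": {
      "items": { "type": "string" },
      "maxItems": 0,
      "title": "Colours",
      "type": "array"
    }
  },
  "required": ["colours"],
  "title": "Result",
  "type": "object"
}
```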
Fixes: #13116