ngxson/llama.cpp
Pull Requests
Commits
xsn/mistral_small_vision
5c039a72 | fix test | ngxson | committed 238 days ago
3894eb95 | add test | ngxson | committed 238 days ago
d7e95ae8 | update llava/readme | ngxson | committed 238 days ago
c53ecc96 | ah sheet it works | ngxson | committed 238 days ago
efce8750 | load ok, missing patch merger | ngxson | committed 238 days ago
656759ae | convert ok | ngxson | committed 239 days ago
6f67cf1f | arg : -hf do not fail if url mismatch (#13219) | ngxson | committed 239 days ago | Verified
16a457fa | fix typo: `n_ctx_pre_seq` -> `n_ctx_per_seq` (#13221) | ddh0 | committed 239 days ago | Verified
3e168bed | convert : improve model arch handling (#13122) | ngxson | committed 239 days ago | Verified
ceda28ef | llava : remove duplicate include (#13207) | tattn | committed 239 days ago | Verified
3b127c73 | common : add -jf / --json-schema-file flag (#12011) | ochafik | committed 239 days ago | Verified
e5007a5e | vulkan: use uint array index to avoid glslang bug (#13193) | jeffbolznv | committed 239 days ago | Verified
41631377 | ggml : fix ppc64le build (#13176) | shalinib-ibm | committed 239 days ago | Verified
07c2e2f7 | convert : correct typo image_mean --> image_std (#13208) | ngxson | committed 239 days ago | Verified
44cd8d91 | feat(ggml-cpu): enable z17 compile (#13182) | taronaeo | committed 239 days ago | Verified
5933e6fd | arg : allow using -hf offline (#13202) | ngxson | committed 239 days ago | Verified
da84c04d | docker : do not build tests (#13204) | ngxson | committed 239 days ago | Verified
a0f7016d | rpc : fix cache directory initialization (#13188) | hbuxiaofei | committed 239 days ago | Verified
19e899ce | scripts: n_depth for compare-llama-bench [no ci] (#13201) | JohannesGaessler | committed 240 days ago | Verified
e2e1ddb9 | server : Prefilling assistant message in openai compatible API (#13174) | matteoserva | committed 240 days ago | Verified
d9d398f8 | sampling : when top-k <= 0 -> noop (#13173) | ggerganov | committed 240 days ago | Verified
5a639801 | llama-bench: fixed size of fields to correctly map to values (#13183) | Alberto Cabrera Pérez | committed 240 days ago | Verified
cdf76586 | CUDA: fix non-cont. inputs for batched mat mul (#13155) | JohannesGaessler | committed 240 days ago | Verified
7d3af70b | llama : llm_type order by size (#13177) | CISC | committed 240 days ago | Verified
00e3e5a1 | mtmd : add qwen2vl and qwen2.5vl (#13141) | ngxson | committed 240 days ago | Verified
e98b3692 | llama : set qwen3 model type sizes (#13175) | CISC | committed 240 days ago | Verified
b6ce7430 | llama-graph : fix text position for mrope (#13159) | ngxson | committed 240 days ago | Verified
5f5e39e1 | model : Nomic Embed Text V2 with Mixture-of-Experts (MoE) architecture (#12466) | manyoso | committed 241 days ago | Verified
eaea3253 | clip : fix model size display (#13153) | ngxson | committed 241 days ago | Verified
43ddab6e | fix(rpc): Improve input validation and error handling (#13069) | thevilledev | committed 241 days ago | Verified