llama.cpp
server: introduce API for serving / loading / unloading multiple models
#17470
Merged
ngxson merged 171 commits into ggml-org:master from ngxson:xsn/server_model_management_v1_2
server: add model management and proxy
fc5901a4
fix compile error
399f536d
does this fix windows?
abc0ca47
fix windows build
54b35457
use subprocess.h, better logging
5423d42a
add test
0ef3b61e
fix windows
7c6eb17f
Merge branch 'master' into xsn/server_model_management_v1_2
919d3f8c
feat: Model/Router server architecture WIP
55d33a8b
more stable
b9ebdf61
fix unsafe pointer
6610724f
also allow terminate loading model
d0ea9e08
add is_active()
5805ca79
refactor: Architecture improvements
8a885768
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2…
c35dee3b
tmp apply upstream fix
21614086
address most problems
5369aaa1
address thread safety issue
6929c9f4
address review comment
be25bccd
add docs (first version)
cd5c6993
address review comment
a2e912cf
feat: Improved UX for model information, modality interactions etc
4bf82a10
chore: update webui build output
cc88f6a7
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2…
45bf2a49
refactor: Use only the message data `model` property for displaying m…
049f40df
chore: update webui build output
c26c3402
add --models-dir param
032b9ff4
feat: New Model Selection UX WIP
8b1d9675
chore: update webui build output
6b7c0a50
feat: Add auto-mic setting
69503aa5
feat: Attachments UX improvements
92585c71
implement LRU
62ee883d
remove default model path
7cd92907
better --models-dir
72415588
add env for args
b0540e8e
address review comments
525e2746
fix compile
457fbdac
refactor: Chat Form Submit component
c274f132
Merge branch 'master' into xsn/server_model_management_v1_2
f2ca54b2
ad endpoint docs
d32bbfec
Merge remote-tracking branch 'webui/allozaur/server_model_management_…
4af1b6cb
feat: Add copy to clipboard to model name in model info dialog
076eec6d
feat: Model unavailable UI state for model selector
db8ed5df
feat: Chat Form Actions UI logic improvements
dc913ec4
feat: Auto-select model from last assistant response
a39ef24c
chore: update webui build output
036cc939
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2…
6282537a
expose args and exit_code in API
f25bfaba
add note
7ef6312f
support extra_args on loading model
f927e21f
allow reusing args if auto_load
74685f41
typo docs
f95f9c51
oai-compat /models endpoint
2e355c7f
cleaner
5ad594e6
address review comments
d65be917
feat: Use `model` property for displaying the `repo/model-name` namin…
1f0cb3ab
refactor: Attachments data
b7ba13b6
chore: update webui build output
48dbef17
refactor: Enum imports
1c214e9a
feat: Improve Model Selector responsiveness
ef5f9d07
chore: update webui build output
49c8062d
refactor: Cleanup
d5a6671b
refactor: Cleanup
f8ff39c6
refactor: Formatters
41764b8f
chore: update webui build output
219fd19e
refactor: Copy To Clipboard Icon component
e92ce079
chore: update webui build output
fb5445e9
refactor: Cleanup
39fb1c2b
chore: update webui build output
188d3236
refactor: UI badges
16747dee
chore: update webui build output
e808f2b2
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2…
76557cd5
refactor: Cleanup
13fe8607
refactor: Cleanup
b2590a7f
chore: update webui build output
5ef3f990
add --models-allow-extra-args for security
6ed192b4
nits
2c6b58f7
add stdin_file
539cbf00
Merge branch 'master' into xsn/server_model_management_v1_2
399b39f2
fix merge
e514b86d
ngxson requested a review from ggerganov 209 days ago
ngxson requested a review from allozaur 209 days ago
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2…
11c26ecf
fix: Retrieve lost setting after resolving merge conflict
7db3d874
github-actions added script
github-actions added testing
github-actions added examples
github-actions added python
github-actions added server
ngxson commented on 2025-11-24
refactor: DatabaseStore -> DatabaseService
ccd6c271
refactor: Database, Conversations & Chat services + stores architectu…
fed6c82e
refactor: Remove redundant settings
f9c911d0
refactor: Multi-model business logic WIP
501badc9
chore: update webui build output
4c24ead8
feat: Switching models logic for ChatForm or when regenerating messge…
b9a3129d
chore: update webui build output
01324493
fix: Add `untrack` inside chat processing info data logic to prevent …
82975a1f
fix: Regenerate
33356f36
feat: Remove redundant settigns + rearrange
c680083c
fix: Audio attachments
5207527e
refactor: Icons
22507fed
chore: update webui build output
81b8e1ab
feat: Model management and selection features WIP
2a280b60
chore: update webui build output
19e5385b
refactor: Improve server properties management
b1cf8bb8
refactor: Icons
23a91cd2
chore: update webui build output
d0d7a88d
feat: Improve model loading/unloading status updates
284557cd
chore: update webui build output
9431f358
refactor: Improve API header management via utility functions
ddf98bdf
remove support for extra args
e40f35fb
set hf_repo/docker_repo as model alias when posible
e2731c37
Merge branch 'master' into xsn/server_model_management_v1_2
becc6026
refactor: Remove ConversationsService
42483f46
refactor: Chat requests abort handling
456828b3
refactor: Server store
d6ee3d13
tmp webui build
1493ee09
refactor: Model modality handling
13e79884
chore: update webui build output
2a5922b1
refactor: Processing state reactivity
6b95118a
fix: UI
69065ddc
refactor: Services/Stores syntax + logic improvements
6a3d6e79
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2…
78ead498
refactor: Architecture cleanup
d7335373
feat: Improve statistic badges
9086bc30
feat: Condition available models based on modality + better model loa…
db479523
docs: Architecture documentation
bc577266
Merge branch 'master' into xsn/server_model_management_v1_2
bdaf44a1
feat: Update logic for PDF as Image
491fe2d3
add TODO for http client
7be833da
refactor: Enhance model info and attachment handling
eed1bd9b
chore: update webui build output
3470b12b
refactor: Components naming
5fadd0fe
chore: update webui build output
04ef4a06
refactor: Cleanup
1cf5daa8
refactor: DRY `getAttachmentDisplayItems` function + fix UI
68b653ef
chore: update webui build output
171a0926
fix: Modality detection improvement for text-based PDF attachments
dd30810d
refactor: Cleanup
1adf173d
docs: Add info comment
2f97dbfa
refactor: Cleanup
c76de5e0
re
4d16459b
refactor: Cleanup
f50ce7b5
refactor: Cleanup
d49d97c6
feat: Attachment logic & UI improvements
648d2dee
refactor: Constants
27b15226
feat: Improve UI sidebar background color
2464e060
chore: update webui build output
ce9c9afe
refactor: Utils imports + move types to `app.d.ts`
493ef087
test: Fix Storybook mocks
2d556bb9
chore: update webui build output
a568e74c
Merge branch 'master' into allozaur/server_model_management_v1_2
33b9cc40
test: Update Chat Form UI tests
4f39da82
refactor: Tooltip Provider from core layout
949b5fd6
refactor: Tests to separate location
ae8a1e81
Merge remote-tracking branch 'origin/allozaur/server_model_management…
6fd720e7
Merge branch 'master' into xsn/server_model_management_v1_2
c1dfccd0
decouple server_models from server_routes
a82dbbfb
test: Move demo test to tests/server
360a5ed6
refactor: Remove redundant method
acd3c581
chore: update webui build output
e8b9d74b
also route anthropic endpoints
23cb4113
Merge remote-tracking branch 'webui/allozaur/server_model_management_…
802e77ea
fix duplicated arg
7b28b5e1
fix invalid ptr to shutdown_handler
4a1c05c3
server : minor
d182544c
ggerganov approved these changes on 2025-12-01
rm unused fn
f2dbe9c0
add ?autoload=true|false query param
c3304075
Merge branch 'master' into xsn/server_model_management_v1_2
05cc22f0
refactor: Remove redundant code
689ca09b
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2…
7a95348d
docs: Update README documentations + architecture & data flow diagrams
73056fb6
fix: Disable autoload on calling server props for the model
c49467a3
chore: update webui build output
9d3b718e
fix ubuntu build
a6d3f83e
danbev commented on 2025-12-01
fix: Model status reactivity
b926cfa3
fix: Modality detection for MODEL mode
01ed8ced
chore: update webui build output
b10d9508
l2k36hk dismissed these changes on 2025-12-01
allozaur dismissed their stale review 202 days ago
Not a review from maintainer
allozaur approved these changes on 2025-12-01
ngxson merged ec18edfc into master 202 days ago
Reviewers
allozaur
ggerganov
danbev
angt
l2k36hk
Assignees
No one assigned
Labels
script
testing
examples
python
server
Milestone
No milestone