llama.cpp
75917152 - server: add --models-memory-max parameter to allow dynamically unloading models when they exceed a memory size threshold

Commit

6 days ago

Author

0cc4m

Committer

0cc4m

Parents