llama-cpp-python
This PR implements the previously stubbed state management methods in the _internals.py module and updates the corresponding API calls in llama.py to use the correct underlying C++ function names.
#2134

Open

bsides230 wants to merge 6 commits into abetlen:main from bsides230:kv-caching-issue

feat: update llama.cpp submodule and bindings for Qwen 3.5 support

d21ef679

fix: set BUILD_NUMBER and LLAMA_INSTALL_VERSION for mtmd build

eacc2584

fix: return bool from kv_cache_seq_rm for partial removal detection

01248477

fix: handle GDN hybrid models that reject partial memory removal

47aedc22

Update llama.cpp submodule to latest ggml-org

2ee9d3d3

This PR implements the previously stubbed state management methods in…

1a4f7589
