llama.cpp
78433f60 - Fix recurrent state serialization for partial reads and writes (#22362)

Commit
22 days ago
The previous code handled only full-tensor reads and writes, so partial reads and writes tripped the `GGML_ASSERT(size == ggml_nbytes(tensor));` assertion when tested with llama-server.
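
The diff itself is not shown here, so the following is only a minimal sketch of the general distinction between a full-tensor write and a partial (offset plus size) write. `toy_tensor`, `write_full`, and `write_partial` are hypothetical names for illustration and are not part of ggml or llama.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a tensor's backing storage; not a ggml type.
struct toy_tensor {
    std::vector<uint8_t> data;
    size_t nbytes() const { return data.size(); }
};

// Full-tensor write: succeeds only when the caller supplies exactly nbytes() bytes.
// This mirrors the shape of a check like GGML_ASSERT(size == ggml_nbytes(tensor)),
// but reports failure instead of aborting.
inline bool write_full(toy_tensor & t, const uint8_t * src, size_t size) {
    if (size != t.nbytes()) {
        return false; // a partial write would trip the assertion at this point
    }
    if (size == 0) {
        return true; // nothing to copy
    }
    std::memcpy(t.data.data(), src, size);
    return true;
}

// Partial write: accepts any byte range [offset, offset + size) inside the tensor.
inline bool write_partial(toy_tensor & t, const uint8_t * src, size_t offset, size_t size) {
    if (offset > t.nbytes() || size > t.nbytes() - offset) {
        return false; // range falls outside the tensor
    }
    if (size == 0) {
        return true; // nothing to copy
    }
    std::memcpy(t.data.data() + offset, src, size);
    return true;
}
```

With an 8-byte `toy_tensor`, a 4-byte chunk is rejected by `write_full` but accepted by `write_partial` at offset 4, which is the kind of chunked access the commit message describes as previously unsupported.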