llama.cpp
78433f60
- Fix recurrent state serialization for partial reads and writes (#22362)
Commit
22 days ago
Fix recurrent state serialization for partial reads and writes (#22362)

The previous code handled only full-tensor reads and writes, and it hit the `GGML_ASSERT(size == ggml_nbytes(tensor));` assertion when tested with llama-server.
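To illustrate the kind of failure described, here is a minimal, self-contained sketch. It is not the code from this commit: `toy_tensor`, `write_full_only`, and `write_partial` are hypothetical names invented for this example, and the real change operates on ggml tensors rather than a `std::vector`. The sketch only contrasts a write path that accepts nothing but a whole-tensor range with one that accepts any in-bounds byte range.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a tensor's backing storage; not a ggml type.
struct toy_tensor {
    std::vector<uint8_t> data;
    size_t nbytes() const { return data.size(); }
};

// Full-tensor-only write: rejects any call that does not cover the whole
// tensor. The commit message reports an assertion on the same size condition;
// this sketch returns false instead of aborting.
bool write_full_only(toy_tensor & t, const void * src, size_t offset, size_t size) {
    if (offset != 0 || size != t.nbytes()) {
        return false;
    }
    std::memcpy(t.data.data(), src, size);
    return true;
}

// Partial-aware write: accepts any byte range [offset, offset + size) that
// lies inside the tensor. The check is written to avoid overflow in
// offset + size.
bool write_partial(toy_tensor & t, const void * src, size_t offset, size_t size) {
    if (offset > t.nbytes() || size > t.nbytes() - offset) {
        return false;
    }
    std::memcpy(t.data.data() + offset, src, size);
    return true;
}
```

With an 8-byte `toy_tensor`, a 4-byte write at offset 4 is rejected by `write_full_only` but accepted by `write_partial`. That is the situation a caller creates when it serializes state in slices rather than as one whole tensor.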
References
#22362 - [Tensor Parallel] Fix recurrent state serialization for partial reads and writes
Author
gaugarg-nv
Parents
7ec36aa8