server : fix cache reuse logic (#12161)


218 days ago

The first KV shift offsets the positions of all tokens after head_c. A subsequent llama_kv_cache_seq_rm call that still uses head_c then removes valid tokens, because their positions have already been shifted.
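
The failure described above can be reproduced with a small toy model. This is only a sketch under stated assumptions, not the server code: `ToyKvCache`, `Chunk`, `reuse_chunks` and the `bounded_shift` flag are hypothetical names, and the toy `seq_rm` / `seq_add` methods only imitate position-range removal and shifting over `[p0, p1)`, with a negative `p1` assumed to mean "to the end". Bounding the shift to the matched chunk is one plausible way to avoid the problem and is assumed here for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-in for one sequence of a KV cache: only token positions are tracked.
struct ToyKvCache {
    std::vector<int> pos;

    // Remove every cell whose position lies in [p0, p1); p1 < 0 means "to the end".
    void seq_rm(int p0, int p1) {
        pos.erase(std::remove_if(pos.begin(), pos.end(),
                                 [&](int p) { return p >= p0 && (p1 < 0 || p < p1); }),
                  pos.end());
    }

    // Add delta to every position in [p0, p1); p1 < 0 means "to the end".
    void seq_add(int p0, int p1, int delta) {
        for (int & p : pos) {
            if (p >= p0 && (p1 < 0 || p < p1)) {
                p += delta;
            }
        }
    }
};

// A reusable chunk: n_match tokens that sit at cache index head_c
// and should end up at prompt position head_p (hypothetical helper type).
struct Chunk {
    int head_p;
    int head_c;
    int n_match;
};

// Apply the remove-then-shift step for each chunk and return the surviving positions.
// bounded_shift == false: shift everything after head_c (the problematic variant).
// bounded_shift == true : shift only [head_c, head_c + n_match) (assumed remedy).
std::vector<int> reuse_chunks(int n_cached, const std::vector<Chunk> & chunks, bool bounded_shift) {
    ToyKvCache kv;
    for (int i = 0; i < n_cached; ++i) {
        kv.pos.push_back(i);
    }

    for (const Chunk & c : chunks) {
        assert(c.head_p <= c.head_c);
        const int kv_shift = c.head_p - c.head_c;

        kv.seq_rm(c.head_p, c.head_c);
        kv.seq_add(c.head_c, bounded_shift ? c.head_c + c.n_match : -1, kv_shift);
    }

    std::sort(kv.pos.begin(), kv.pos.end());
    return kv.pos;
}
```

With 10 cached tokens and two chunks `{2, 4, 2}` and `{4, 8, 2}`, the unbounded variant returns positions 0..3. The first shift has already moved the second chunk from positions 8, 9 to 6, 7, so the second removal over `[4, 8)` deletes it. The bounded variant returns positions 0..5, keeping both chunks.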