Fix tensor indexing crash in serve generate_response KV cache continuation (#44735)

The `generate_response` method indexes `inputs` as a dict
(`inputs["input_ids"]`), but `inputs` is already the raw `input_ids`
tensor at that point. This causes a TypeError on the second request
in a conversation session, when KV cache reuse is attempted.
Use `inputs.shape[-1]` instead, matching `generate_response_non_streaming`.
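
A minimal, dependency-free sketch of the failure mode and the fix. `FakeTensor`,
`seq_len_buggy`, and `seq_len_fixed` are hypothetical names used only for
illustration; they are not part of the serve code. The stand-in raises a
TypeError on string keys to mirror the error described above; the exact
exception and message from a real tensor may differ.

```python
class FakeTensor:
    """Hypothetical stand-in for a 2-D tensor: has .shape, rejects string keys."""

    def __init__(self, rows):
        self.rows = rows
        self.shape = (len(rows), len(rows[0]) if rows else 0)

    def __getitem__(self, key):
        # Assumption: mimic a tensor refusing dict-style access.
        if isinstance(key, str):
            raise TypeError("tensor indices must be integers or slices, not str")
        return self.rows[key]


def seq_len_buggy(inputs):
    # Treats `inputs` as a dict-like batch; fails when it is already a tensor.
    return inputs["input_ids"].shape[-1]


def seq_len_fixed(inputs):
    # `inputs` is the raw input_ids tensor, so read its last dimension directly.
    return inputs.shape[-1]


# One sequence of four token ids (values are arbitrary placeholders).
input_ids = FakeTensor([[101, 2023, 2003, 102]])

try:
    seq_len_buggy(input_ids)
except TypeError as exc:
    print(f"buggy path failed: {exc}")

print(f"fixed path length: {seq_len_fixed(input_ids)}")
```

The buggy helper only works when it receives the full dict-like batch, which is
why the crash appears only on the code path where `inputs` has already been
unwrapped to the tensor.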
Fixes #44734

Co-authored-by: easonysliu <easonysliu@tencent.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>