langchain
d3d23e23 - fix(anthropic): streaming token counting to defer input tokens until completion (#32518)

Commit
147 days ago
fix(anthropic): streaming token counting to defer input tokens until completion (#32518) Supersedes #32461 Fixed incorrect input token reporting during streaming when tools are used. Previously, input tokens were counted at `message_start` before tool execution, leading to inaccurate counts. Now input tokens are properly deferred until `message_delta` (completion), aligning with Anthropic's billing model and SDK expectations. **Before Fix:** - Streaming with tools: Input tokens = 0 ❌ - Non-streaming with tools: Input tokens = 472 ✅ **After Fix:** - Streaming with tools: Input tokens = 472 ✅ - Non-streaming with tools: Input tokens = 472 ✅ Aligns with Anthropic's SDK expectations. The SDK handles input token updates in `message_delta` events: ```python # https://github.com/anthropics/anthropic-sdk-python/blob/main/src/anthropic/lib/streaming/_messages.py if event.usage.input_tokens is not None: current_snapshot.usage.input_tokens = event.usage.input_tokens ```
Author
Parents
Loading