transformers
55fc5dc9 - Fix left-padding position_ids in _static_sample for batched generation

Commit
76 days ago
Fix left-padding position_ids in _static_sample for batched generation

Without this, _static_sample sets position_ids = cache_position for all batch elements, which is only correct when there is no left-padding. With left-padded batches (required for decoder-only batched generation), different batch elements have different padding amounts.

Compute position_offset = prefill_len - attention_mask.sum() once before the loop, then position_ids = cache_position - position_offset each step.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
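The offset arithmetic described in the commit message can be illustrated with a minimal sketch. This is not the code from the commit: it uses plain Python lists in place of tensors, the helper names `position_offsets` and `decode_position_ids` are hypothetical, and it assumes the sum over `attention_mask` is taken per batch row so that each row gets its own offset.

```python
def position_offsets(attention_mask):
    """Return the number of left-padding tokens in each batch row.

    Assumption: every row has the same length (the prefill length), and
    the mask uses 1 for real tokens and 0 for padding.
    """
    prefill_len = len(attention_mask[0])
    return [prefill_len - sum(row) for row in attention_mask]


def decode_position_ids(cache_position, offsets):
    """Return the position id of each row's token at this cache slot."""
    return [cache_position - offset for offset in offsets]


# Two prompts left-padded to a prefill length of 5.
attention_mask = [
    [0, 0, 1, 1, 1],  # 2 padding tokens, 3 real tokens
    [1, 1, 1, 1, 1],  # no padding
]

# Computed once, before the decoding loop.
offsets = position_offsets(attention_mask)

# The first generated token is written at cache slot 5, the next at slot 6.
first_step = decode_position_ids(5, offsets)
second_step = decode_position_ids(6, offsets)

print(offsets)      # [2, 0]
print(first_step)   # [3, 5]
print(second_step)  # [4, 6]
```

In this sketch the padded row continues from position 3, directly after its three real tokens, while the unpadded row continues from position 5. Using `cache_position` unchanged for both rows would assign position 5 to the padded row as well, which is the mismatch the commit message describes.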