Fix false positive right-padding warning for decoder-only models in pipeline (#44021)
* Fix false positive right-padding warning for decoder-only models
Two changes to fix the spurious 'right-padding was detected' warning
that fires for Qwen3 and other models during batched pipeline inference:
1. TextGenerationPipeline: Set padding_side='left' automatically for
decoder-only models. The default tokenizer padding_side is 'right',
which causes incorrect padding for batched generation. The pipeline
now overrides this to 'left' on initialization.
2. GenerationMixin.generate: Improve right-padding detection by using
the attention_mask when available, instead of only checking if the
last token equals pad_token_id. The old heuristic produced false
positives when pad_token_id == eos_token_id or bos_token_id (as is
the case for Qwen3 where both are token 151643).
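
The difference between the two heuristics in change 2 can be sketched in plain Python. This is an illustration only, not the code in GenerationMixin.generate: the function names are hypothetical, nested lists stand in for tensors, and the example assumes the pad id doubles as the EOS id.

```python
# Hypothetical helpers for illustration; not the transformers implementation.
PAD_ID = 151643  # assumed to also be the EOS id in this example


def last_token_check(input_ids, pad_token_id):
    """Old-style heuristic: flag the batch if any row ends with the pad id."""
    return any(row[-1] == pad_token_id for row in input_ids)


def mask_aware_check(input_ids, pad_token_id, attention_mask=None):
    """Prefer the attention mask; fall back to the last-token check without it."""
    if attention_mask is None:
        return last_token_check(input_ids, pad_token_id)
    # A row is right-padded only if its final position is masked out.
    return any(mask_row[-1] == 0 for mask_row in attention_mask)


# Left-padded batch whose rows legitimately end with EOS (same id as pad).
left_padded_ids = [
    [PAD_ID, PAD_ID, 10, 11, PAD_ID],
    [20, 21, 22, 23, PAD_ID],
]
left_padded_mask = [
    [0, 0, 1, 1, 1],
    [1, 1, 1, 1, 1],
]

# Genuinely right-padded batch.
right_padded_ids = [[10, 11, PAD_ID, PAD_ID]]
right_padded_mask = [[1, 1, 0, 0]]

false_positive = last_token_check(left_padded_ids, PAD_ID)
fixed = mask_aware_check(left_padded_ids, PAD_ID, left_padded_mask)
still_detected = mask_aware_check(right_padded_ids, PAD_ID, right_padded_mask)
```

With the mask available, a trailing EOS on a left-padded row no longer looks like padding, while real right-padding is still caught.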
Fixes #43906
Related to #38071
* Fix padding_side conflict when feature_extractor is present (e.g., WhisperForCausalLM)
Only set tokenizer.padding_side='left' when no feature_extractor exists,
to avoid ValueError from pad_collate_fn when they disagree.
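
A minimal sketch of the guard described above, assuming a hypothetical helper name and using SimpleNamespace objects as stand-ins for a real tokenizer and feature extractor (the actual pipeline code may be structured differently):

```python
from types import SimpleNamespace


def set_left_padding_if_safe(tokenizer, feature_extractor=None):
    """Hypothetical helper: force left padding only when no feature extractor exists."""
    if tokenizer is not None and feature_extractor is None:
        tokenizer.padding_side = "left"
    return tokenizer


# Text-only case: the tokenizer default ('right') is overridden.
text_tokenizer = SimpleNamespace(padding_side="right")
set_left_padding_if_safe(text_tokenizer)

# Case with a feature extractor: the tokenizer is left untouched so the
# two components cannot end up disagreeing about the padding side.
audio_tokenizer = SimpleNamespace(padding_side="right")
audio_extractor = SimpleNamespace(padding_side="right")
set_left_padding_if_safe(audio_tokenizer, audio_extractor)
```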
* Fix the test itself; the pipeline doesn't need feature extractors
---------
Co-authored-by: ManasVardhan <manasvardhan@users.noreply.github.com>
Co-authored-by: raushan <raushan@huggingface.co>