Fix GPT-2 no-past attention fusion for transformers >= 4.27 (#27449)
## Summary
- Fix `FusionGptAttentionNoPast` mask pattern matching to support both
`torch.uint8` (old) and `torch.bool` (new) causal masks
- Add synthetic ONNX graph generator and unit test for the no-past
attention fusion path
## Motivation
Fixes #16453
In `transformers >= 4.27` (Feb 2023), the causal attention mask dtype
changed from `torch.uint8` to `torch.bool`
([commit](https://github.com/huggingface/transformers/commit/c51dc4f92755c67a83f3fc8a0bd6b3e64df199e4)).
This removed a `Cast` node from the exported ONNX graph.
`FusionGptAttentionNoPast.fuse()` hardcoded `Cast` as the first element
in `match_parent_path`, causing the mask path match to fail silently for
all modern transformers exports. The result: **zero Attention nodes
fused** for any GPT-2 model exported without past state.
The sibling class `FusionGptAttention` (with-past) was already fixed to
handle both patterns using `match_parent_paths` (plural). This PR
applies the same approach to the no-past variant.
## Changes
### `fusion_gpt_attention_no_past.py`
- Replace `match_parent_path` with `match_parent_paths` for the
Where-based mask path (lines 187-201), trying both the Cast-prefixed
pattern (older transformers) and the Cast-less pattern (transformers >= 4.27)
- Remove stale TODO comment that noted the fusion "stopped working"
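The idea behind the fix can be illustrated with a toy model: instead of one hardcoded op path, a list of candidate paths is tried in order and the first hit wins. This is a simplified sketch of the pattern, not the real onnxruntime `OnnxModel` API; the single-parent `Node` class and the `Cast`/`Slice` chain are illustrative stand-ins:

```python
# Toy illustration of single-path vs. multi-path parent matching,
# mimicking the idea behind match_parent_path / match_parent_paths.
# The graph model and op chain are simplified stand-ins, not the
# actual onnxruntime helpers.

class Node:
    def __init__(self, op_type, parent=None):
        self.op_type = op_type
        self.parent = parent  # single-parent chain for simplicity

def match_parent_path(node, ops):
    """Walk up the parent chain; return matched nodes or None."""
    matched = []
    current = node.parent
    for op in ops:
        if current is None or current.op_type != op:
            return None
        matched.append(current)
        current = current.parent
    return matched

def match_parent_paths(node, paths):
    """Try each candidate path in order; return (index, nodes) of
    the first match, or (-1, None) if none match."""
    for i, ops in enumerate(paths):
        nodes = match_parent_path(node, ops)
        if nodes is not None:
            return i, nodes
    return -1, None

# Old export inserts a Cast before the mask ops; new export drops it.
old_chain = Node("Where", Node("Cast", Node("Slice")))
new_chain = Node("Where", Node("Slice"))

paths = [["Cast", "Slice"], ["Slice"]]  # Cast-prefixed first, then Cast-less

print(match_parent_paths(old_chain, paths)[0])  # matches path 0
print(match_parent_paths(new_chain, paths)[0])  # matches path 1
```

With the single-path matcher, the `new_chain` case returns `None` and the fusion silently bails out, which is exactly the failure mode the PR fixes.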
### `gpt2_model_generator.py`
- Add `create_gpt2_attention_no_past()` function that builds a synthetic
GPT-2 no-past attention graph with the Where-based mask pattern
- Supports `add_cast` parameter to test both mask variants
### `test_attention_fusion.py`
- Add `test_gpt2_attention_no_past_fusion()` that verifies an Attention
node is fused for all combinations of `add_cast` and `switch_add_inputs`
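The four-variant sweep can be driven by a product over the two flags. A minimal sketch of the loop shape, where `build_model_and_fuse` is a hypothetical stand-in for the real generator-plus-optimizer calls in the test:

```python
# Sketch of iterating the four test variants (add_cast x switch_add_inputs).
# build_model_and_fuse is a hypothetical placeholder for the real
# create_gpt2_attention_no_past() + fusion + node-count steps.
from itertools import product

variants = list(product([False, True], repeat=2))

for add_cast, switch_add_inputs in variants:
    # Real test: build the synthetic graph with these flags, run the
    # attention fusion pass, then assert one Attention node exists.
    print(f"add_cast={add_cast}, switch_add_inputs={switch_add_inputs}")

print(len(variants))  # 4 combinations
```

Enumerating both flags guards against a regression in either mask variant or Add-input ordering slipping through unnoticed.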
## Test Plan
- [x] New test `test_gpt2_attention_no_past_fusion` passes (4 variants:
with/without Cast × normal/switched Add inputs)
- [x] All existing attention fusion tests pass (10/10)
- [x] Lint clean on modified files (`lintrunner` reports no issues for
new code)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>