transformers
00ab75e6 - fix(benchmarks): correct sdpa_backend inconsistency and attn_implementation for continuous batching (#42339)

This commit fixes two bugs in BenchmarkConfig reported in issue #42211:

1. **sdpa_backend inconsistency (line 105)**: The warning message states "sdpa_backend must be None", but the code was setting it to "math". Changed it to None to match the warning message. This allows PyTorch to auto-select the appropriate SDPA backend rather than forcing one globally, which is correct for continuous batching with custom attention masks.

2. **Invalid attn_implementation (line 243)**: Changed from "paged|sdpa" to "sdpa". Using "paged|sdpa" directly bypassed the validation logic at lines 91-105, since that check matches only exactly "sdpa". The "paged|" prefix is added automatically by init_continuous_batching() in continuous_api.py, so the config should use plain "sdpa" for consistency with the other configs.

Both bugs were introduced in commit 069684ef87 (PR #41916).

Fixes #42211
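For reference, here is a minimal sketch of what the corrected validation could look like, assuming a dataclass-based BenchmarkConfig with a `__post_init__` check. The field names (`attn_implementation`, `sdpa_backend`) and the helper `init_continuous_batching()` come from the commit message; the defaults, the `continuous_batching` flag, the logger, and the exact warning text are illustrative assumptions, not the actual transformers source:

```python
# Illustrative sketch only; field names follow the commit message,
# everything else (defaults, logger, structure) is assumed.
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger(__name__)

@dataclass
class BenchmarkConfig:
    attn_implementation: str = "eager"
    sdpa_backend: Optional[str] = None
    continuous_batching: bool = False

    def __post_init__(self) -> None:
        # The check matches plain "sdpa" only, so configs must not
        # pre-apply the "paged|" prefix themselves (fix 2): the prefix
        # is added later by init_continuous_batching() in continuous_api.py.
        if self.continuous_batching and self.attn_implementation == "sdpa":
            if self.sdpa_backend is not None:
                logger.warning("sdpa_backend must be None for continuous batching")
                # Fix 1: before the patch this line set the backend to
                # "math", contradicting the warning above; None lets
                # PyTorch auto-select the appropriate SDPA backend.
                self.sdpa_backend = None

# With the fix, a continuous-batching config uses plain "sdpa":
config = BenchmarkConfig(attn_implementation="sdpa", continuous_batching=True)
assert config.sdpa_backend is None
```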