fix: restore `_attn_implementation` and fix request offset in `generate_batch()` (#45943)
* fix: restore _attn_implementation and fix request offset in generate_batch()
* Lower the attn switch to the innermost scope and make request reordering more robust
* Switch from inference mode to no grad
---------
Co-authored-by: remi-or <remi.pierre_o@orange.fr>