Fix IndexError with DeepSpeed ZeRO-3 when the `kernels` rotary function is attached (#45414)
* Fix `IndexError: pop from an empty deque` under DeepSpeed ZeRO-3
When `kernels` is installed, `@use_kernelized_func` attaches a
`rotary_fn` child `nn.Module` to attention layers. DeepSpeed ZeRO-3's
parameter coordinator traces the module graph at init and expects
every registered submodule to be invoked during forward. The model's
forward still calls the plain Python `apply_rotary_pos_emb`, so
`rotary_fn` is never executed and the trace desynchronizes, raising
`IndexError: pop from an empty deque` on the second forward.
Skip attaching the kernelized submodule when ZeRO-3 is enabled; users
running under ZeRO-3 fall back to the Python implementation, which is
what they were getting before #41147.
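As a rough illustration of the guard described above, the following is a minimal, dependency-free sketch and not the actual patch. `zero3_enabled_stub`, `KernelizedRotary`, `use_kernelized_func_sketch`, and the `zero_stage` config key are all hypothetical stand-ins; the real change lives in `@use_kernelized_func` and relies on the library's own ZeRO-3 check.

```python
# Toy sketch: every name here is a hypothetical stand-in,
# not the transformers or DeepSpeed API.

def zero3_enabled_stub(config):
    """Stand-in for a real ZeRO-3 check; reads a plain dict."""
    return config.get("zero_stage") == 3


class KernelizedRotary:
    """Stand-in for the kernel-backed child module the decorator attaches."""

    def __call__(self, q, k):
        return q, k


def use_kernelized_func_sketch(config):
    """Class decorator that attaches `rotary_fn` unless ZeRO-3 is enabled."""

    def decorator(cls):
        original_init = cls.__init__

        def __init__(self, *args, **kwargs):
            original_init(self, *args, **kwargs)
            # Under ZeRO-3, skip the attachment so no registered child
            # exists that forward would never invoke.
            if not zero3_enabled_stub(config):
                self.rotary_fn = KernelizedRotary()

        cls.__init__ = __init__
        return cls

    return decorator


def build_attention(config):
    """Build a decorated toy attention layer for the given config."""

    @use_kernelized_func_sketch(config)
    class Attention:
        def __init__(self):
            self.hidden = 8

    return Attention()


has_rotary_default = hasattr(build_attention({"zero_stage": 0}), "rotary_fn")
has_rotary_zero3 = hasattr(build_attention({"zero_stage": 3}), "rotary_fn")
```

With the guard in place, the toy layer built under ZeRO-3 has no `rotary_fn` attribute, so there is no unused child for a module tracer to wait on.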
Fixes #45137
* Add dates to new model cards to satisfy check-repository-consistency