[Doc] MoE routing capture and replay recipe (#44925)
* initial prototype
* remove unneeded selected_experts
* Revert MoE expert replay; document pattern via monkey patching
Replace the intrusive record/replay implementation across modeling files
with a documentation-only guide. All three pieces — the replayable router
subclass, the replay context manager, and the runtime OutputRecorder
registration — can be built on top of the existing monkey_patching and
output_capturing APIs without touching core MoE modeling code.
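The replay half of the pattern can be sketched in plain Python. `DummyRouter`, `replay_routing`, and the recorded-output format below are illustrative stand-ins, not transformers APIs; real code would patch each decoder layer's router/gate module and feed it the captured per-layer routing decisions:

```python
from contextlib import contextmanager

# Hypothetical stand-in for an MoE router module. In a real model this
# would be something like model.layers[i].mlp.gate.
class DummyRouter:
    def forward(self, hidden_states):
        # Normally computes routing weights from hidden_states.
        return ("computed", hidden_states)

@contextmanager
def replay_routing(router, recorded):
    """Monkey-patch router.forward to serve pre-recorded routing
    decisions instead of recomputing them; restore on exit."""
    original = router.forward
    it = iter(recorded)

    def patched(hidden_states):
        # Ignore the input and return the next recorded decision.
        return next(it)

    router.forward = patched
    try:
        yield router
    finally:
        router.forward = original

router = DummyRouter()
with replay_routing(router, [("replayed", 0), ("replayed", 1)]):
    out0 = router.forward(None)
    out1 = router.forward(None)
# Outside the context manager the original forward is restored.
after = router.forward("x")
```

The context-manager shape matters: restoring `forward` in a `finally` block guarantees the model is usable for normal (non-replay) generation even if replay raises mid-forward.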
Also shows the one-line conversion from vLLM's CompletionOutput.routed_experts
numpy array to the per-layer tuple this pattern expects, enabling RLHF
workflows that generate with vLLM and train with transformers.
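The vLLM-side conversion can be illustrated with numpy alone. The `(num_layers, num_tokens, top_k)` layout assumed here is illustrative; check the actual shape of `routed_experts` in your vLLM version before relying on it:

```python
import numpy as np

# Assumed layout for the captured array: one slice per decoder layer,
# each holding the top-k expert indices chosen for every generated token.
num_layers, num_tokens, top_k = 2, 3, 2
routed_experts = np.arange(num_layers * num_tokens * top_k).reshape(
    num_layers, num_tokens, top_k
)

# The one-line conversion: iterating a numpy array over its first axis
# yields one (num_tokens, top_k) array per layer.
per_layer = tuple(routed_experts)
```

With torch in the loop, the same line becomes `tuple(torch.from_numpy(layer) for layer in routed_experts)`; either way the result is the per-layer tuple the replay pattern consumes.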
* Preserve unrelated in-progress work in __init__.py and generic.py
The previous revert commit accidentally rolled back unrelated work on
these two files — version bump, TorchvisionBackend addition, and
module-alias refactor. Restore those while keeping the MoE-specific
additions (MoERouting export, output_moe_routing kwarg) removed.