DeepSpeed
837787a0 - Add OPSD trainer, hybrid-engine rollout, and end-to-end entry point

Commit
39 days ago
Add OPSD trainer, hybrid-engine rollout, and end-to-end entry point

Lands the fully-runnable hybrid-engine training path:

- a backend-agnostic RolloutEngine ABC with RolloutRequest / RolloutBatch / SamplingConfig dataclasses
- a HybridEngineRollout implementation that uses DeepSpeed's accelerated decode when an inference policy exists and otherwise falls back to GatheredParameters + the raw HF generate (covers Qwen-family and other models not in DeepSpeed's inference container list)
- a left-padded prompt dataset + collator
- a three-phase trainer loop (rollout -> teacher forward + cache -> student forward + streamed KL + backward + step)
- the argparse + deepspeed.initialize entry point
- base DeepSpeed ZeRO-3 + hybrid_engine JSON configs
- a 5-step smoke config and launcher script
- a 20-prompt math toy dataset for the smoke run

Smoke-validated end-to-end on 2x H200 with Qwen2.5-0.5B-Instruct student and Qwen2.5-1.5B-Instruct teacher; loss finite for 5 steps. Rollout interface contract is covered by tests/test_rollout_interface.py.

Signed-off-by: Zhipeng Wang <zhipengbayern@gmail.com>
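
For readers unfamiliar with the shape of a backend-agnostic rollout interface, here is a minimal sketch of what such a contract could look like. Only the names RolloutEngine, RolloutRequest, RolloutBatch, and SamplingConfig come from the commit message. Every field, the generate method name, the left_pad helper, and the EchoRollout toy backend are assumptions made for illustration and may not match the code in this commit.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class SamplingConfig:
    # Hypothetical fields; the commit message does not list the real ones.
    max_new_tokens: int = 16
    temperature: float = 1.0
    top_p: float = 1.0


@dataclass
class RolloutRequest:
    # Assumed layout: left-padded prompt token ids, one row per prompt.
    prompt_ids: List[List[int]]
    sampling: SamplingConfig = field(default_factory=SamplingConfig)


@dataclass
class RolloutBatch:
    # Assumed layout: prompts are echoed back next to the sampled responses.
    prompt_ids: List[List[int]]
    response_ids: List[List[int]]


class RolloutEngine(ABC):
    """Backend-agnostic contract: a request goes in, a batch comes out."""

    @abstractmethod
    def generate(self, request: RolloutRequest) -> RolloutBatch:
        ...


def left_pad(seqs: List[List[int]], pad_id: int) -> List[List[int]]:
    """Pad on the left so every prompt ends at the same position."""
    width = max(len(s) for s in seqs)
    return [[pad_id] * (width - len(s)) + list(s) for s in seqs]


class EchoRollout(RolloutEngine):
    """Toy backend that repeats the last prompt token.

    It stands in for a real decoder only to exercise the interface.
    """

    def generate(self, request: RolloutRequest) -> RolloutBatch:
        n = request.sampling.max_new_tokens
        responses = [[row[-1]] * n for row in request.prompt_ids]
        return RolloutBatch(request.prompt_ids, responses)


if __name__ == "__main__":
    prompts = left_pad([[1, 2, 3], [4]], pad_id=0)
    request = RolloutRequest(prompts, SamplingConfig(max_new_tokens=2))
    print(EchoRollout().generate(request))
```

Keeping the trainer dependent only on the abstract class is what would let a decoder backend be swapped without touching the training loop; left padding keeps the final prompt token of every row aligned, which is the usual requirement for batched generation with decoder-only models.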