[Draft] Add On-Policy Distillation (OPSD) Trainer in DeepSpeed #8027
Add OPSD example: config, divergence losses, utils + tests
932f0b52
Add OPSD frozen teacher with CPU logit cache + tests
14d8fe7e
Add OPSD trainer, hybrid-engine rollout, and end-to-end entry point
837787a0
Add OPSD vLLM rollout scaffold, Qwen2/Qwen3 weight bridges, and README
6384396b
PKUWZP changed the title from "Add On-Policy Distillation (OPSD) example app" to "[Draft] Add On-Policy Distillation (OPSD) Trainer in DeepSpeed" 34 days ago