DeepSpeed
[Draft] Add On-Policy Distillation (OPSD) Trainer in DeepSpeed
#8027

Open

PKUWZP wants to merge 4 commits into deepspeedai:master from PKUWZP:zhipwang_opd_pr

Add OPSD example: config, divergence losses, utils + tests

932f0b52

Add OPSD frozen teacher with CPU logit cache + tests

14d8fe7e

Add OPSD trainer, hybrid-engine rollout, and end-to-end entry point

837787a0

Add OPSD vLLM rollout scaffold, Qwen2/Qwen3 weight bridges, and README

6384396b

PKUWZP requested a review from tohtana 34 days ago

PKUWZP changed the title ~~Add On-Policy Distillation (OPSD) example app~~ [Draft] Add On-Policy Distillation (OPSD) Trainer in DeepSpeed 34 days ago

chatgpt-codex-connector commented on 2026-05-26

Reviewers

chatgpt-codex-connector

tohtana
