DeepSpeed
8ad00250 - 2 workarounds for tests to work w. newer PP

Commit
4 years ago
2 workarounds for tests to work with newer PP (pipeline parallelism):
1. Rename GPT2ModelPipe to MockGPT2ModelPipe, due to hacks associated with the GPT2ModelPipe name.
2. Hardcode the attention mask in the mock GPT model pipe; newer PP requires stashing the attention mask to get around issues with bool dtypes.
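The second workaround can be illustrated with a minimal sketch. This is not the code from the commit: the class name `MockAttentionStage`, its `seq_len` parameter, and the toy attention computation are all hypothetical, chosen only to show the idea of a stage that builds a fixed causal mask itself, so that only floating-point activations have to be passed between pipeline stages and no bool tensor is exchanged.

```python
import torch
import torch.nn as nn


class MockAttentionStage(nn.Module):
    """Hypothetical mock stage that owns a hardcoded causal attention mask.

    Assumption for illustration: the mask depends only on the sequence
    length, so it can be built once here instead of travelling through
    the pipeline as a bool tensor.
    """

    def __init__(self, seq_len: int):
        super().__init__()
        # Lower-triangular bool mask: position i may attend to positions <= i.
        mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
        # Non-persistent buffer: follows the module across devices,
        # but is not written to the state dict.
        self.register_buffer("attn_mask", mask, persistent=False)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, dim); the only tensor this stage receives.
        scores = hidden @ hidden.transpose(-1, -2)
        # Block attention to future positions using the hardcoded mask.
        scores = scores.masked_fill(~self.attn_mask, float("-inf"))
        return torch.softmax(scores, dim=-1) @ hidden


# Usage: the caller passes activations only, never a mask.
stage = MockAttentionStage(seq_len=4)
activations = torch.randn(2, 4, 8)
output = stage(activations)
```

The trade-off in a sketch like this is that the mask is fixed at construction time, which is acceptable for a test mock but would not cover inputs with varying sequence lengths or padding.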