transformers
56af8df3 - HF <-> megatron checkpoint reshaping and conversion for GPT (#19317)

Commit
3 years ago
HF <-> megatron checkpoint reshaping and conversion for GPT (#19317)

* HF <-> megatron checkpoint conversion handling reshaping from different tensor and parallel sizes

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* addressing comments

* add doc strings and 🐛 fixes

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>