DeepSpeed
35b350b2 - Fix quantized-inference & Add generic support of checkpoint loading (#2547)

Commit
3 years ago
Fix quantized-inference & Add generic support of checkpoint loading (#2547)

* fix checkpoint loading when it is a dictionary
* fix some issues with saving ckpt & int8 inference
* fix quantized-inference & add generic support of checkpoint loading
* remove int8 hard-coded flag
* fix mlp return tensors
* fix several issues to load checkpoints of GPT-J, GPT-NeoX, and OPT with different TP sizes
* add more comments & descriptions for the checkpoint-loading module

Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
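The "generic support of checkpoint loading" item above means the loader accepts either an already-loaded state dictionary or one or more checkpoint file paths (e.g. tensor-parallel shards), rather than assuming a single fixed input format. A minimal sketch of that normalization idea, using hypothetical names (this is not DeepSpeed's actual API):

```python
from typing import Any, Dict, List, Union

# Hypothetical helper illustrating the dispatch the commit describes:
# accept a state dict, a single path, or a list of shard paths, and
# normalize them all to a list that downstream code can iterate over.
def resolve_checkpoint(ckpt: Union[Dict[str, Any], str, List[str]]) -> List[Any]:
    if isinstance(ckpt, dict):
        # Checkpoint passed as an already-loaded state dictionary:
        # wrap it so it looks like a single "shard".
        return [ckpt]
    if isinstance(ckpt, str):
        # Single checkpoint file path.
        return [ckpt]
    # List of shard paths, e.g. one file per tensor-parallel rank.
    return list(ckpt)
```

Downstream loading code can then iterate over the returned list uniformly, loading each entry from disk only if it is a path.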