DeepSpeed
79ff1627 - Update to new torch grad hook API: BF16Optimizer and Stage2 (#7189)

Commit
263 days ago
This commit extends PR [#6773](https://github.com/deepspeedai/DeepSpeed/pull/6773) to ZeRO Stage 2 and to BF16Optimizer.

Starting with PyTorch 2.1, there is a new, more robust hook API on the parameter itself: `param.register_post_accumulate_grad_hook()`. The appropriate API is selected automatically depending on the PyTorch version.

---------

Signed-off-by: Max Kovalenko <mkovalenko@habana.ai>
Signed-off-by: Masahiro Tanaka <mtanaka@microsoft.com>
Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com>
Signed-off-by: inkcherry <mingzhi.liu@intel.com>
Signed-off-by: Logan Adams <loadams@microsoft.com>
Signed-off-by: c8ef <c8ef@outlook.com>
Signed-off-by: Hongwei <hongweichen@microsoft.com>
Signed-off-by: Bruno Magalhaes <bruno.magalhaes@synthesia.io>
Signed-off-by: Hongwei Chen <hongweichen@microsoft.com>
Signed-off-by: Shaik Raza Sikander <srsikander@habana.ai>
Signed-off-by: shaomin <wukon1992@gmail.com>
Signed-off-by: Stas Bekman <stas@stason.org>
Signed-off-by: siqi <siqi@tecorigin.com>
Signed-off-by: Wei Wu <wuwei211x@gmail.com>
Signed-off-by: ShellyNR <shelly.nahir@live.biu.ac.il>
Signed-off-by: Lai, Yejing <yejing.lai@intel.com>
Signed-off-by: Liang Cheng <astarxp777@gmail.com>
Signed-off-by: A-transformer <astarxp777@gmail.com>
Signed-off-by: yueyang.hyy <yueyang.hyy@alibaba-inc.com>
Co-authored-by: Masahiro Tanaka <mtanaka@microsoft.com>
Co-authored-by: inkcherry <mingzhi.liu@intel.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Logan Adams <loadams@microsoft.com>
Co-authored-by: Masahiro Tanaka <81312776+tohtana@users.noreply.github.com>
Co-authored-by: Connector Switch <c8ef@outlook.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Co-authored-by: Hongwei Chen <33092912+hwchen2017@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Bruno Magalhaes <bruno.magalhaes@synthesia.io>
Co-authored-by: loadams <loadams@users.noreply.github.com>
Co-authored-by: A-transformer <cl5743590921@gmail.com>
Co-authored-by: Raza Sikander <srsikander@habana.ai>
Co-authored-by: wukong1992 <wukong1992@users.noreply.github.com>
Co-authored-by: shaomin <wukon1992@gmail.com>
Co-authored-by: siqi654321 <siqi202311@163.com>
Co-authored-by: siqi <siqi@tecorigin.com>
Co-authored-by: Wei Wu <45323446+U-rara@users.noreply.github.com>
Co-authored-by: Shelly Nahir <73890534+ShellyNR@users.noreply.github.com>
Co-authored-by: snahir <snahir@habana.ai>
Co-authored-by: Yejing-Lai <yejing.lai@intel.com>
Co-authored-by: A-transformer <astarxp777@gmail.com>
Co-authored-by: Ma, Guokai <guokai.ma@gmail.com>
Co-authored-by: Glaceon-Hyy <ffheyy0017@gmail.com>
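
To illustrate the hook selection that the commit message describes, here is a minimal sketch, not the code from this commit. The helper name `register_grad_hook`, the `keep_alive` list, and `demo()` are hypothetical names introduced for illustration. The sketch also swaps in feature detection with `hasattr` where the commit message says the choice depends on the PyTorch version, and it assumes the pre-2.1 fallback hooks the parameter's gradient-accumulator node, a common workaround before the new API existed.

```python
def register_grad_hook(param, hook, keep_alive):
    """Arrange for hook(param) to run after param.grad has been accumulated.

    Hypothetical helper for illustration only. `keep_alive` is a caller-owned
    list that holds references needed by the legacy path.
    """
    if not param.requires_grad:
        return None

    # PyTorch >= 2.1: the hook API lives on the parameter itself.
    # Assumption: feature detection stands in for an explicit version check.
    if hasattr(param, "register_post_accumulate_grad_hook"):
        return param.register_post_accumulate_grad_hook(hook)

    # Older PyTorch: reach the AccumulateGrad node through a no-op view
    # of the parameter and hook that node instead.
    grad_acc = param.expand_as(param).grad_fn.next_functions[0][0]

    def node_hook(*unused):
        # Node hooks receive (grad_inputs, grad_outputs); adapt them to the
        # new-style signature, which receives only the parameter.
        hook(param)

    # The node must stay referenced, otherwise it can be garbage collected
    # together with its hook.
    keep_alive.append(grad_acc)
    return grad_acc.register_hook(node_hook)


def demo():
    """Record which parameters had their hook fired during backward."""
    import torch  # imported lazily so the helper itself has no hard dependency

    model = torch.nn.Linear(4, 2)
    seen, keep_alive = [], []
    for name, p in model.named_parameters():
        register_grad_hook(p, lambda q, name=name: seen.append(name), keep_alive)
    model(torch.randn(3, 4)).sum().backward()
    return seen
```

Calling `demo()` returns the names of the parameters whose hooks fired during the backward pass. Both paths hand the hook the parameter itself, so the same callback body (for example, one that reduces or copies `param.grad`) can be used on either PyTorch version.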