BLOOM Inference via DeepSpeed-Inference, Accelerate and DeepSpeed-ZeRO (#308)
* hardcode the dtype depending on the model (dtype selection sketched after this list)
* set the mp (model-parallel) size based on the world_size
* remove hardcoded world_size
* add bigscience/bigscience-small-testing
* fixes
* add zero-inference script
* fixes
* fix
* working script
* renames
* fixes
* fix for offline use
* add benchmark
* add benchmark
* update
* cleanup
* update
* report timings in msecs
* cleanup
* improve
* fix benchmark, add warmup (benchmark arithmetic sketched after this list)
* update
* fix; thanks Michael Wyatt
* clarify
* add bloom batch-inference script
* removed the names :-)
* fold in the bs (batch size) functionality from the other script
* fix
* restore do_sample
* dump generate args
* fix
* fix
* support any batch size
* div by bs
* mul by bs
* add cpu_offload; sync scripts
* wip
* improvements
* fixes
* fixes
* add accelerate script
* fix
* wip
* wip
* stats
* add OnDevice and remove zero-inference (#316) (OnDevice usage sketched after this list)
* wip
* rework generate + benchmark
* figure out the memory map dynamically (memory-map heuristic sketched after this list)
* bug fix
* fix ds-zero-inference wrt device
* bug fix
* update
* update
* fix
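The sketches below are minimal reconstructions, under stated assumptions, of techniques the commits above name; none is the PR's exact code.

Dtype selection based on the model: a sketch assuming the official BLOOM checkpoints ship bfloat16 weights while other models fall back to float16. The `bf16_models` set is illustrative, not the PR's exact mapping.

```python
import torch

def infer_dtype(model_name: str) -> torch.dtype:
    # illustrative mapping: which checkpoints get bfloat16 is an assumption
    bf16_models = {"bigscience/bloom", "bigscience/bigscience-small-testing"}
    return torch.bfloat16 if model_name in bf16_models else torch.float16
```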
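Benchmark arithmetic ("msecs", "div by bs", "mul by bs"): a sketch of warmup plus timing where per-token latency and throughput are both derived from the total number of generated tokens, so one divides by the batch size and the other multiplies by it. `generate`, `bs`, `new_tokens`, and `cycles` are illustrative names; `generate` is assumed to block until all tokens are produced.

```python
import time

def benchmark(generate, bs: int, new_tokens: int, cycles: int = 5):
    generate()  # warmup so one-time CUDA init doesn't skew the timed runs
    t0 = time.time()
    for _ in range(cycles):
        generate()
    elapsed = time.time() - t0
    total_new_tokens = cycles * bs * new_tokens
    msecs_per_token = elapsed / total_new_tokens * 1000  # latency: div by bs
    tokens_per_sec = total_new_tokens / elapsed          # throughput: mul by bs
    return msecs_per_token, tokens_per_sec
```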
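OnDevice: a sketch assuming DeepSpeed's `deepspeed.OnDevice` context manager, which instantiates the model on the `meta` device so no real weights are allocated before ds-inference loads the checkpoint shards. The `init_inference` kwargs reflect the API of that era, and the checkpoint json path is hypothetical.

```python
import os

import torch
import deepspeed
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("bigscience/bloom")
# meta device: shapes and dtypes only, no memory allocated for weights
with deepspeed.OnDevice(dtype=torch.bfloat16, device="meta"):
    model = AutoModelForCausalLM.from_config(config)

# ds-inference injects fused kernels and loads the real weights, sharded
# across mp_size ranks; "checkpoints.json" (hypothetical) lists the shards
model = deepspeed.init_inference(
    model,
    mp_size=int(os.getenv("WORLD_SIZE", "1")),
    dtype=torch.float16,
    checkpoint="checkpoints.json",
    replace_with_kernel_inject=True,
)
```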
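Dynamic memory map: a sketch that splits the estimated parameter bytes evenly across the visible GPUs and hands the result to Accelerate via `max_memory`. The GPT-style parameter-count formula and the 5% headroom are rough assumptions, not the PR's exact heuristic.

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM

def get_max_memory_per_gpu_dict(model_name: str, dtype: torch.dtype) -> dict:
    config = AutoConfig.from_pretrained(model_name)
    h, l, v = config.hidden_size, config.num_hidden_layers, config.vocab_size
    # rough decoder-only estimate: ~12*l*h^2 for the blocks plus embeddings
    n_params = l * 12 * h * h + v * h
    bytes_per_param = torch.finfo(dtype).bits // 8
    n_gpus = torch.cuda.device_count()
    # allow ~5% above the even share (an assumed fudge factor)
    per_gpu_bytes = int(n_params * bytes_per_param / n_gpus * 1.05)
    return {i: per_gpu_bytes for i in range(n_gpus)}

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom",
    device_map="auto",
    max_memory=get_max_memory_per_gpu_dict("bigscience/bloom", torch.bfloat16),
    torch_dtype=torch.bfloat16,
)
```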
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>