Commit
3 years ago
merge main (#331) * Reshape deepspeed checkpoint (#239) * Reshape deepspeed checkpoint * add checkpoint tests * Validate input folder * Tests for tp/pp reshape * remove debug folders * fix test_checkpoint_reshaping_empty_dir * Fix unit tests * Remove deepspeed checkpoint utils * Use DS 3D reshaping utils * convert to bf16 * wip universal chkpt * rename * rename * wip on fragments dealing * cleanup * Loading universal checkpoint with reshaping * all gpu1<->2 reshapes work * param attrs * make the tests adaptable to the number of available gpus * WIP * WIP * WIP * WIP * Debug functions * args should be required, don't create another latest file * Parallelize shard extraction * close+join pool; add tqdm; comment out noise * rename * parameterize * Parallel slice merging * Cleanup * allow inspection on a machine w/o gpus * test against the right DS branch * DS size was merged Co-authored-by: Stas Bekman <stas@stason.org> * BLOOM Inference via DeepSpeed-Inference, Accelerate and DeepSpeed-ZeRO (#308) * hardcode the dtype depending on the model * change the mp based on the world_size * remove hardcoded world_size * add bigscience/bigscience-small-testing * fixes * add zero-inference script * fixes * fix * working script * renames * fixes * fix for offline use * add benchmark * add benchmark * update * cleanup * update * msecs * cleanup * improve * fix benchmark, add warmup * update * fix; thanks Michael Wyatt * clarify * add bloom batch-inference script * removed the names :-) * fold the bs functionality from the other script * fix * restore do_sample * dump generate args * fix * fix * support any batchsize * div by bs * mul by bs * add cpu_offload; sync scripts * wip * improvements * fixes * fixes * add accelerate script * fix * wip * wip * stats * add OnDevice and remove zero-inference (#316) * wip * rework generate + benchmark * figure out the memory map dynamically * bug fix * fix ds-zero-inference wrt device * bug fix * update * update * fix Co-authored-by: Reza Yazdani <reyazda@microsoft.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com> Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Stas Bekman <stas@stason.org> Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> Co-authored-by: Reza Yazdani <reyazda@microsoft.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Author
Parents
Loading