merge main (#331)
* Reshape deepspeed checkpoint (#239)
* Reshape deepspeed checkpoint
* add checkpoint tests
* Validate input folder
* Tests for tp/pp reshape
* remove debug folders
* fix test_checkpoint_reshaping_empty_dir
* Fix unit tests
* Remove deepspeed checkpoint utils
* Use DS 3D reshaping utils
* convert to bf16
* wip universal chkpt
* rename
* rename
* wip on dealing with fragments
* cleanup
* Loading universal checkpoint with reshaping
* all gpu1<->2 reshapes work
* param attrs
* make the tests adaptable to the number of available gpus
* WIP
* WIP
* WIP
* WIP
* Debug functions
* args should be required, don't create another latest file
* Parallelize shard extraction
* close+join pool; add tqdm; comment out noise
* rename
* parameterize
* Parallel slice merging
* Cleanup
* allow inspection on a machine w/o gpus
* test against the right DS branch
* DS size was merged
Co-authored-by: Stas Bekman <stas@stason.org>
* BLOOM Inference via DeepSpeed-Inference, Accelerate and DeepSpeed-ZeRO (#308)
* hardcode the dtype depending on the model
* change the mp based on the world_size
* remove hardcoded world_size
* add bigscience/bigscience-small-testing
* fixes
* add zero-inference script
* fixes
* fix
* working script
* renames
* fixes
* fix for offline use
* add benchmark
* add benchmark
* update
* cleanup
* update
* msecs
* cleanup
* improve
* fix benchmark, add warmup
* update
* fix; thanks Michael Wyatt
* clarify
* add bloom batch-inference script
* removed the names :-)
* fold the bs functionality from the other script
* fix
* restore do_sample
* dump generate args
* fix
* fix
* support any batchsize
* div by bs
* mul by bs
* add cpu_offload; sync scripts
* wip
* improvements
* fixes
* fixes
* add accelerate script
* fix
* wip
* wip
* stats
* add OnDevice and remove zero-inference (#316)
* wip
* rework generate + benchmark
* figure out the memory map dynamically
* bug fix
* fix ds-zero-inference wrt device
* bug fix
* update
* update
* fix
Co-authored-by: Reza Yazdani <reyazda@microsoft.com>
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>