ZeRO1: Add bucketing logic to control the size of tensors for all-gather/reduce-scatter (#6025)
Co-authored-by: Rahul Solanki <rhsoln@amazon.com>
Co-authored-by: guangtai <guangtai@amazon.com>
Co-authored-by: Amithrajith Mamidala <amithrm@amazon.com>