DeepSpeed
[CPU] SHM based allreduce improvement for small message size
#5571
Merged

delock
delock add profile for naive all_reduce
58302133
delock add multi parallel copy
fec2c9be
delock alternative multi-parallel memcpy
4c642a1c
delock use double buffer
ed748ce1
delock change naive all reduce to symmetric
e2312ec7
delock clean up
3e4b6c3f
delock don't use coll_begin set in naive_all_reduce
031b8310
delock separate buffer for different algorithm
2b15c220
delock turn off profile
d5865aa7
delock fix distributed naive allreduce
25882778
delock cleanup
2f694439
delock Remove profiling code
d1b2f098
delock add back original naive_all_reduce
0ba1f07c
delock remove naive_all_reduce
05fc2505
delock cleanup
7bc708d4
delock remove barrier which is not needed
af7d4fab
delock cleanup
b76937cb
delock can handle > 16 rank with efficiency
0da84b6e
delock Remove REPEAT
49c2153b
delock clean up state
a3cc1293
delock fix distributed allreduce perf
7b41d2f2
delock remove unnecessary state change
87accf4b
delock double buffer for distributed_naive_all_reduce
a1ff77e0
delock fix result error
8af61131
delock multiparallel copy #1
00a1c272
delock single omp region multi parallel copy
31b36439
delock add alternative path
8e5639eb
delock remove multi-memcpy which actually causes perf drop
c0733cbe
delock fix distributed accuracy issue
b7713b69
delock cleanup
3f088e4b
delock requested a review from awan-10 1 year ago
delock requested a review from mrwyattii 1 year ago
delock requested a review from arashb 1 year ago
delock fix format
6c7ec551
delock Merge branch 'master' into gma/symmetric_naive_allreduce
dabae15d
tjruwase Merge branch 'master' into gma/symmetric_naive_allreduce
f0634219
tjruwase removed review request from arashb 1 year ago
tjruwase removed review request from awan-10 1 year ago
tjruwase removed review request from mrwyattii 1 year ago
tjruwase requested a review from adk9 1 year ago
tjruwase requested a review from tjruwase 1 year ago
adk9 requested changes on 2024-06-10
delock Follow comments, remove unneeded code and syncs.
608cf7c1
adk9 approved these changes on 2024-06-12
adk9 Merge branch 'master' into gma/symmetric_naive_allreduce
1847a10d
delock fix format
7f614cb4
adk9 Merge branch 'master' into gma/symmetric_naive_allreduce
e1853f6e
adk9 merged eda5075b into master 1 year ago