DeepSpeed
tweaks to ds-attn, distilbert policy, and mup
#2649
Merged

jeffra merged 3 commits into master from jeffra/ds-attn-tweaks
jeffra tweaks to ds-attn, distilbert policy, mup, etc.
d0d28bb5
jeffra requested a review from RezaYazdaniAminabadi 2 years ago
jeffra requested a review from mrwyattii 2 years ago
jeffra requested a review from awan-10 2 years ago
jeffra requested a review from cmikeh2 2 years ago
jeffra requested a review from arashb 2 years ago
jeffra changed the title from "tweaks to ds-attn, distilbert policy, mup, etc." to "tweaks to ds-attn, distilbert policy, and mup" 2 years ago
jeffra fixes for distilbert
c49b10cd
jeffra adjust kwargs to match position args for bert (hack)
da81b59e
cmikeh2 approved these changes on 2022-12-28
jeffra merged d9b788d7 into master 2 years ago
jeffra deleted the jeffra/ds-attn-tweaks branch 2 years ago
