pytorch
fbafcecf - [optim][radam] group tensors in foreach to maximize perf (#92365)

Commit
1 year ago
[optim][radam] group tensors in foreach to maximize perf (#92365)

Also noticed that eps is not being used nor tested at all for the mta impl of RAdam. Will fix in a followup PR before turning foreach to default!

Pull Request resolved: https://github.com/pytorch/pytorch/pull/92365
Approved by: https://github.com/albanD
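The commit message does not show how the grouping is done. As a rough illustration only: multi-tensor ("foreach") kernels operate on lists of tensors that share a device and dtype, so one plausible approach is to bucket the optimizer's parallel lists (params, grads, and state) by that pair and make one batched call per bucket. The sketch below assumes this approach. The function name `group_by_device_and_dtype` is hypothetical and is not taken from the PyTorch source. It works on any objects that expose `.device` and `.dtype` attributes.

```python
from collections import defaultdict
from typing import Any, Dict, List, Sequence, Tuple


def group_by_device_and_dtype(
    tensorlists: Sequence[Sequence[Any]],
) -> Dict[Tuple[Any, Any], List[List[Any]]]:
    """Bucket parallel lists by (device, dtype).

    Hypothetical helper for illustration. `tensorlists` holds parallel
    lists (e.g. [params, grads, exp_avgs]) where index i in every list
    refers to the same parameter. The key is taken from the first list.
    """
    num_lists = len(tensorlists)
    # Each bucket keeps the same "list of parallel lists" shape as the input.
    grouped: Dict[Tuple[Any, Any], List[List[Any]]] = defaultdict(
        lambda: [[] for _ in range(num_lists)]
    )
    for items in zip(*tensorlists):
        key = (items[0].device, items[0].dtype)
        for bucket_list, item in zip(grouped[key], items):
            bucket_list.append(item)
    return dict(grouped)
```

With real tensors, each bucket's lists could then be passed to a multi-tensor update in one call per (device, dtype) pair, rather than falling back to a per-tensor loop when the inputs are mixed.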