Accumulate 16-bit float sums in 32-bit accumulators (#60387)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60387
Fixes gh-59489
Using 32-bit accumulators is a win-win: precision improves, and so does
performance, since the half-precision types had to be converted to 32-bit
float and back to do the arithmetic anyway.
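A minimal sketch of the precision effect (illustrative only, not code from this PR): summing many small fp16 values. With fp16 accumulation, the running sum stalls once an addend is smaller than half an ulp of the partial sum; accumulating in fp32 avoids this.

```python
import torch

# 10,000 copies of ~0.1 in half precision; the exact sum is ~999.76.
x = torch.full((10000,), 0.1, dtype=torch.float16)

# Reference computed entirely in float32.
ref = x.float().sum()

# fp16 sum: with 32-bit accumulators, the intermediate arithmetic is
# done in float32, so only the final cast back to fp16 loses precision.
out = x.sum()

print(ref, out)
```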
Note that for multi-threaded or discontiguous sums, partial sums may be
stored in the output buffer, where they are necessarily truncated to 16 bits.
Fixing this would require a rework of TensorIterator reductions.
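A rough simulation of that remaining caveat (hypothetical sketch, not the TensorIterator code path): each chunk accumulates in fp32, but its partial result is rounded to fp16 when written to the output before the final combine.

```python
import torch

x = torch.full((10000,), 0.1, dtype=torch.float16)

# Single pass with fp32 accumulation throughout (the fixed path):
full = x.float().sum().half()

# Chunked reduction, e.g. one chunk per thread: each partial sum is
# truncated to fp16 at the hand-off, losing a little precision.
partials = [c.float().sum().half() for c in x.chunk(8)]
chunked = torch.stack(partials).float().sum().half()

print(full, chunked)
```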
Test Plan: Imported from OSS
Reviewed By: jbschlosser
Differential Revision: D29447187
Pulled By: ngimel
fbshipit-source-id: d0619e0ca2fe116d101460142b79ca56fd6d0840