pytorch
af1d4376 - Improve precision and performance for BFloat16 upsampling (#91169)

Commit

1 year ago

Improve precision and performance for BFloat16 upsampling (#91169)

### Description
- Fix precision issue for BFloat16 upsampling: https://github.com/pytorch/pytorch/issues/89212
- Improve performance for BFloat16 upsampling.

### Testing
data type: BFloat16

- Single core

contiguous:

mode | scale_factor | shape | before backward / ms | after backward / ms
-- | -- | -- | -- | --
nearest | 2 | [10, 3, 200, 200] | 14.47 | 8.34
linear | 2 | [3, 200, 200] | 3.69 | 2.74
bilinear | 2 | [3, 5, 200, 200] | 87.99 | 49.05
trilinear | 2 | [3, 3, 3, 100, 100] | 171.02 | 72.53
bicubic | 2 | [3, 3, 200, 200] | 176.29 | 78

channels last:

mode | scale_factor | shape | before backward / ms | after backward / ms
-- | -- | -- | -- | --
nearest | 2 | [10, 3, 200, 200] | 17.70 | 10.30
linear | 2 | [3, 200, 200] | \ | \
bilinear | 2 | [3, 5, 200, 200] | 50.90 | 18.83
trilinear | 2 | [3, 3, 3, 100, 100] | 121.56 | 42.60
bicubic | 2 | [3, 3, 200, 200] | 179.40 | 80

- 20 cores

contiguous:

mode | scale_factor | shape | before backward / ms | after backward / ms
-- | -- | -- | -- | --
nearest | 2 | [10, 3, 200, 200] | 1.17 | 1.01
linear | 2 | [3, 200, 200] | 0.41 | 0.26
bilinear | 2 | [3, 5, 200, 200] | 7.19 | 4.07
trilinear | 2 | [3, 3, 3, 100, 100] | 21.32 | 9.33
bicubic | 2 | [3, 3, 200, 200] | 178.67 | 10

channels last:

mode | scale_factor | shape | before backward / ms | after backward / ms
-- | -- | -- | -- | --
nearest | 2 | [10, 3, 200, 200] | 2.25 | 1.55
linear | 2 | [3, 200, 200] | \ | \
bilinear | 2 | [3, 5, 200, 200] | 20.17 | 7.20
trilinear | 2 | [3, 3, 3, 100, 100] | 43.33 | 15.66
bicubic | 2 | [3, 3, 200, 200] | 176.76 | 10

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91169
Approved by: https://github.com/jgong5, https://github.com/mingfeima, https://github.com/Skylion007
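The timings in the commit message are easier to compare as speedup ratios. The snippet below is a small sketch that derives them for the 20-core contiguous case, using the before/after backward times (ms) copied from the commit message. The names `timings_ms` and `speedup` are illustrative only and not part of the PR.

```python
# Backward-pass timings in ms (before, after) for the 20-core, contiguous,
# BFloat16 case, copied from the commit message.
timings_ms = {
    "nearest": (1.17, 1.01),
    "linear": (0.41, 0.26),
    "bilinear": (7.19, 4.07),
    "trilinear": (21.32, 9.33),
    "bicubic": (178.67, 10.0),
}


def speedup(before_ms: float, after_ms: float) -> float:
    """Return how many times faster the 'after' measurement is."""
    return before_ms / after_ms


# Speedup per interpolation mode.
speedups = {mode: speedup(b, a) for mode, (b, a) in timings_ms.items()}

for mode, ratio in speedups.items():
    print(f"{mode:<10} {ratio:.2f}x")
```

Under these numbers every mode gets faster, and bicubic stands out at roughly 17.9x. The single-core bicubic time in the commit message (176.29 ms) is close to the 20-core "before" time, which is consistent with the old bicubic backward path not scaling across cores, though the commit message does not state this explicitly.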

Author

CaoE

Committer

pytorchmergebot

Parents

040d2cc9
