SemanticDiff

pytorch
efd20de2 - fix multihead attention for half (#21658)


Commit

5 years ago

fix multihead attention for half (#21658)

Summary:
Currently multihead attention for half type is broken
```
  File "/home/ngimel/pytorch/torch/nn/functional.py", line 3279, in multi_head_attention_forward
    attn_output = torch.bmm(attn_output_weights, v)
RuntimeError: Expected object of scalar type Float but got scalar type Half for argument #2 'mat2'
```
because softmax converts half inputs into fp32 inputs. This is unnecessary - all the computations in softmax will be done in fp32 anyway, and the results need to be converted into fp16 for the subsequent batch matrix multiply, so nothing is gained by writing them out in fp32. This PR gets rid of type casting in softmax, so that half works.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/21658
Differential Revision: D15807487
Pulled By: zhangguanheng66
fbshipit-source-id: 4709ec71a36383d0d35a8f01021e12e22b94992d
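The dtype mismatch described in the commit message can be reproduced in isolation. The following is a minimal sketch, not the code touched by the PR: the tensor shapes and the names `scores`, `v`, `upcast_weights`, and `same_dtype_weights` are hypothetical, and the fp32-then-cast-back line only emulates on CPU the behaviour the message describes (fp32 math, fp16 result).

```python
import torch
import torch.nn.functional as F

# Hypothetical attention shapes: (batch * heads, tgt_len, src_len) for the
# scores and (batch * heads, src_len, head_dim) for the values.
scores = torch.randn(2, 3, 4).half()
v = torch.randn(2, 4, 5).half()

# Requesting an fp32 result from softmax leaves the weights in a different
# dtype than the half-precision values.
upcast_weights = F.softmax(scores, dim=-1, dtype=torch.float32)

# torch.bmm does not promote mixed dtypes, so this call raises RuntimeError.
mismatch_raised = False
try:
    torch.bmm(upcast_weights, v)
except RuntimeError as err:
    mismatch_raised = True
    print("bmm failed:", err)

# Emulation of the intended behaviour (an assumption for illustration):
# do the math in fp32, then hand back a result in the input dtype so that
# it matches `v` for the following batch matrix multiply.
same_dtype_weights = F.softmax(scores.float(), dim=-1).to(scores.dtype)
print(upcast_weights.dtype, same_dtype_weights.dtype, v.dtype)
```

The sketch stops short of running the half-precision `torch.bmm` itself, since support for half-precision kernels on CPU depends on the PyTorch version; on a CUDA device the matching dtypes are what allow the multiply to proceed.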

Author

Natalia Gimelshein

Committer

facebook-github-bot


