Fix fp16 precision and cos sim correctness (#913)
Summary:
Currently we are using fp32 as the baseline for all correctness tests.
When comparing the fp32 baseline against the torchdynamo eager+fp16 result with `torch.allclose()`, the test reports "incorrect"; comparing the same results with cosine similarity reports "correct".
Here is an overview of the results:
| | torch.allclose() | cosine similarity |
|---------------|------------------|-------------------|
| fp32 baseline vs. torchdynamo-eager-fp16 | *incorrect* | correct |
| fp16 baseline vs. torchdynamo-eager-fp16 | correct | correct |
Therefore, we use cosine similarity for all fp16 correctness tests.
Note that this is not a torchdynamo problem: comparing the fp32 baseline against the fp16 result with `torch.allclose()` without torchdynamo also yields `incorrect`.
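The effect can be reproduced without torchdynamo at all. The sketch below simulates an fp16 result by rounding an fp32 output to half precision and back; `torch.allclose()` at its default tolerances rejects it, while cosine similarity accepts it. The `cosine_similarity_correct` helper and its `0.99` threshold are illustrative assumptions, not the exact check used in the benchmark suite.

```python
import torch

def cosine_similarity_correct(ref, res, threshold=0.99):
    # Hypothetical checker: flatten both tensors and compare the cosine
    # similarity of the full result vectors against a fixed threshold.
    ref_f = ref.flatten().to(torch.float32)
    res_f = res.flatten().to(torch.float32)
    sim = torch.nn.functional.cosine_similarity(ref_f, res_f, dim=0)
    return sim.item() > threshold

torch.manual_seed(0)
x = torch.randn(1024, 1024)
w = torch.randn(1024, 1024)

baseline = x @ w                       # fp32 baseline
# Simulate an fp16 run: round the output to half precision and back.
fp16_result = baseline.half().float()

# Element-wise closeness fails: fp16 rounding error (~1e-3 relative)
# is far above allclose's default rtol of 1.3e-6 for fp32.
allclose_ok = torch.allclose(baseline, fp16_result)

# Cosine similarity passes: the rounding barely changes the direction
# of the result vector.
cosine_ok = cosine_similarity_correct(baseline, fp16_result)

print(allclose_ok, cosine_ok)
```

This is why the table above shows `torch.allclose()` marking the fp16 run *incorrect* against the fp32 baseline even though the computation is numerically sound.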
Pull Request resolved: https://github.com/pytorch/benchmark/pull/913
Reviewed By: frank-wei
Differential Revision: D36461801
Pulled By: xuzhao9
fbshipit-source-id: d9bf1a759768153d12774acdb5447024fbf33517