change LayoutLMForSequenceClassification inference accuracy tolerance (#129728)
Summary:
Fixes #128510.
https://github.com/pytorch/pytorch/pull/124451 makes LayoutLMForSequenceClassification hit the SDPA pattern 1 and then encounter the accuracy issue. The issue only happens with BF16 inference single thread. This PR tends to increase the model tolerance and make the check pass. Note that even the math-version SDPA could have the issue because of some small implementation diff.
The test log:
Single thread
```
correct_result: SequenceClassifierOutput(loss=tensor(0.5998), logits=tensor([[0.3301, 0.1338]], dtype=torch.bfloat16), hidden_states=None, attentions=None)
new_result: SequenceClassifierOutput(loss=tensor(0.6016), logits=tensor([[0.3281, 0.1357]], dtype=torch.bfloat16), hidden_states=None, attentions=None)
E0627 01:09:16.762789 140281313759104 torch/_dynamo/utils.py:1476] RMSE (res-fp64): 0.00151, (ref-fp64): 0.00046 and shape=torch.Size([1, 2]). res.dtype: torch.bfloat16, multiplier: 3.000000, tol: 0.001000
E0627 01:09:16.762972 140281313759104 torch/_dynamo/utils.py:1390] Accuracy failed for key name logits
fail_accuracy
```
Multiple threads
```
correct_result: SequenceClassifierOutput(loss=tensor(0.6007), logits=tensor([[0.3301, 0.1357]], dtype=torch.bfloat16), hidden_states=None, attentions=None)
new_result: SequenceClassifierOutput(loss=tensor(0.6016), logits=tensor([[0.3281, 0.1357]], dtype=torch.bfloat16), hidden_states=None, attentions=None)
pass
```
X-link: https://github.com/pytorch/pytorch/pull/129728
Approved by: https://github.com/jgong5, https://github.com/jansel
Reviewed By: kit1980
Differential Revision: D59357429
fbshipit-source-id: c4af446fc0cc105c7c1322119f5b24dca610588e