make quantizable MHA work with torch.jit.script (#57774)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57774
Makes `torch.nn.quantizable.MultiheadAttention`
work with `torch.jit.script`.
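
A minimal sketch of what this enables (the module path, dimensions, and inputs below are illustrative; newer PyTorch releases also expose the module as `torch.ao.nn.quantizable.MultiheadAttention`):

```python
import torch

# Hypothetical example: script the quantizable MultiheadAttention,
# which this change makes compatible with torch.jit.script.
embed_dim, num_heads = 8, 2
mha = torch.nn.quantizable.MultiheadAttention(embed_dim, num_heads)
mha.eval()

scripted = torch.jit.script(mha)

# Run the scripted module on dummy (seq_len, batch, embed_dim) inputs.
q = torch.randn(4, 1, embed_dim)
out, attn_weights = scripted(q, q, q)
print(out.shape)  # torch.Size([4, 1, 8])
```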
Test Plan:
```
python test/test_quantization.py TestQuantizedOps.test_custom_module_multi_head_attention
```
Imported from OSS
Reviewed By: jerryzh168
Differential Revision: D28268218
fbshipit-source-id: 422868d9d26cae015d3c691ea710d82ffac3fa7f