Improve PyTorch profiler flop computation formulas (#51377)
Summary:
Improve the FLOP computation formula for the aten::conv2d operator so that it accounts for the stride, pad, dilation, and groups arguments.
This diff also fixes the following issues:
- Apply a factor of 2 to aten::mm, because each multiply-accumulate that contributes to the output counts as one multiplication and one addition.
- Correct the names of the scalar operators to aten::mul and aten::add.
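
For illustration, the counting convention described above can be sketched in plain Python. This is a minimal sketch and not the profiler's actual code: the helper names (`conv2d_output_size`, `conv2d_flops`, `mm_flops`) are hypothetical, the output-size expression is the standard one documented for torch.nn.Conv2d, and counting each multiply-accumulate as 2 FLOPs for conv2d is an assumption made to match the aten::mm convention.

```python
def conv2d_output_size(size, kernel, stride=1, pad=0, dilation=1):
    """Spatial output size along one dimension (torch.nn.Conv2d formula)."""
    return (size + 2 * pad - dilation * (kernel - 1) - 1) // stride + 1


def conv2d_flops(input_shape, weight_shape, stride=(1, 1), pad=(0, 0),
                 dilation=(1, 1), groups=1):
    """Estimate FLOPs for a 2-D convolution (hypothetical helper).

    input_shape:  (N, C_in, H, W)
    weight_shape: (C_out, C_in // groups, kH, kW)
    Assumption: one multiply-accumulate counts as 2 FLOPs.
    """
    n, c_in, h, w = input_shape
    c_out, c_in_per_group, kh, kw = weight_shape
    if c_in != c_in_per_group * groups:
        raise ValueError("weight shape is inconsistent with groups")
    h_out = conv2d_output_size(h, kh, stride[0], pad[0], dilation[0])
    w_out = conv2d_output_size(w, kw, stride[1], pad[1], dilation[1])
    # Each output element needs (C_in / groups) * kH * kW multiply-accumulates.
    macs = n * c_out * h_out * w_out * c_in_per_group * kh * kw
    return 2 * macs


def mm_flops(a_shape, b_shape):
    """Estimate FLOPs for an (M, K) x (K, N) matrix multiplication."""
    m, k = a_shape
    k2, n = b_shape
    if k != k2:
        raise ValueError("inner dimensions do not match")
    # M * N outputs, each a K-term dot product: K multiplications, K additions.
    return 2 * m * k * n


if __name__ == "__main__":
    print(conv2d_flops((1, 3, 8, 8), (4, 3, 3, 3), pad=(1, 1)))
    print(mm_flops((2, 3), (3, 4)))
```

Increasing the stride shrinks the output and therefore the count, while grouping divides the per-output work by the number of groups, which is why a formula that ignores these arguments overestimates.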
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51377
Test Plan:
```sh
python test/test_profiler.py
```
Reviewed By: jspark1105
Differential Revision: D26165223
Pulled By: xuzhao9
fbshipit-source-id: 2c5f0155c47af2e6a19332fd6ed73ace47fa072a