[nnc] Use int64 to compute matmul flops heuristic (#58676)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/58676
We only generate asm for small matmuls, but we were computing the number of
flops with an int32, which overflows for moderately sized inputs.
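
For illustration, here is a minimal standalone sketch (not the actual NNC code; the shapes and names are hypothetical) of how a flop-count heuristic like 2*M*N*K can overflow 32-bit arithmetic while an int64 computation stays correct:

```
#include <cstdint>
#include <iostream>

int main() {
  // Hypothetical matmul shape, large enough that 2*M*N*K exceeds INT32_MAX.
  const int64_t M = 1024, N = 1024, K = 1024;

  // Buggy version of the heuristic: the product is formed in 32-bit
  // arithmetic, so it overflows (signed overflow is undefined behavior).
  const int32_t m32 = static_cast<int32_t>(M);
  const int32_t n32 = static_cast<int32_t>(N);
  const int32_t k32 = static_cast<int32_t>(K);
  const int32_t flops32 = 2 * m32 * n32 * k32;

  // Fixed version: keep the computation in int64 so the count is correct.
  const int64_t flops64 = 2 * M * N * K;  // 2147483648, just past INT32_MAX

  std::cout << "int32 flop count: " << flops32 << "\n";  // overflowed
  std::cout << "int64 flop count: " << flops64 << "\n";  // correct
  return 0;
}
```

With a correct int64 count, the "is this matmul small enough to generate asm for?" comparison behaves as intended instead of tripping on a wrapped-around value.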
Test Plan:
```
buck test mode/dev //caffe2/test:static_runtime -- --exact 'caffe2/test:static_runtime - test_mlp (test_static_runtime.TestStaticModule)'
```
Reviewed By: navahgar
Differential Revision: D28562157
fbshipit-source-id: a07ceba5209ef6022ead09140380c116994755cf