inductor: fix FloorDiv issue for dynamic shape path (#101793)
For the TIMM ```tf_mixnet_l``` model on the CPU dynamic shape path, we always get a wrong result compared with eager mode. The root cause is that we compute a wrong index when vectorizing:
```
for(long i2=static_cast<long>(0L); i2<static_cast<long>(16L*(((std::ceil((1.0/2.0)*(std::ceil((1.0/2.0)*(std::ceil((1.0/2.0)*(std::ceil((1.0/2.0)*ks1))))))))*(std::ceil((1.0/2.0)*(std::ceil((1.0/2.0)*(std::ceil((1.0/2.0)*(std::ceil((1.0/2.0)*ks1))))))))) / 16L)); i2+=static_cast<long>(16L))
```
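The main loop's index uses ```/``` rather than ```//```. Because ```std::ceil``` returns a floating-point value, ```/ 16L``` is a true division, so the loop bound is not rounded down to a multiple of 16. After this PR, the ```tf_mixnet_l``` accuracy test passes.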
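A minimal sketch of the difference, assuming a loop bound shaped like the one above. The helper names (```reduced_extent```, ```bound_true_div```, ```bound_floor_div```) are hypothetical, and ```std::floor``` is used here only to illustrate floor division; it is not necessarily the exact code Inductor emits after this PR.

```cpp
#include <cmath>

// Hypothetical helper: apply ceil(x / 2) four times, mirroring the nested
// std::ceil((1.0/2.0)*...) chain in the generated loop bound.
double reduced_extent(long ks1) {
    double s = static_cast<double>(ks1);
    for (int i = 0; i < 4; ++i) {
        s = std::ceil((1.0 / 2.0) * s);
    }
    return s;
}

// Buggy form: the product is a double, so "/ 16L" is a true division and
// 16 * (n / 16) simply gives n back instead of rounding down.
long bound_true_div(long ks1) {
    double n = reduced_extent(ks1) * reduced_extent(ks1);
    return static_cast<long>(16L * (n / 16L));
}

// Intended form (illustration only): floor the quotient first, so the bound
// is the largest multiple of 16 that does not exceed n.
long bound_floor_div(long ks1) {
    double n = reduced_extent(ks1) * reduced_extent(ks1);
    return static_cast<long>(16L * std::floor(n / 16L));
}
```

For example, with ```ks1 = 224``` the reduced extent is 14, so the loop covers 196 elements. The true-division bound stays at 196, which lets the 16-wide main loop start one more iteration at ```i2 = 192``` and run past the end of the data, while the floor-division bound is 192 and leaves the last 4 elements to the tail loop.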
To reproduce this issue:
```
python -m torch.backends.xeon.run_cpu --node_id 0 benchmarks/dynamo/timm_models.py --accuracy --float32 -dcpu --inference -n5 --inductor --dynamic-shapes --only tf_mixnet_l
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/101793
Approved by: https://github.com/jgong5, https://github.com/EikanWang, https://github.com/ezyang