Skip bandwidth-bound mms (#102199)
Reduces compilation time. This was particularly needed for cm3leon_generate, which has many small matmuls of different sizes.
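
For readers unfamiliar with the term, here is a rough illustration of what "bandwidth bound" can mean for a matmul. It is a sketch under stated assumptions, not the heuristic this PR uses: the function names, the `machine_balance` threshold, and the simple read-once/write-once memory model are all hypothetical. It compares arithmetic intensity (FLOPs per byte moved) against a hardware ratio.

```python
def arithmetic_intensity(m: int, k: int, n: int, bytes_per_element: int = 2) -> float:
    """FLOPs per byte for an (m x k) @ (k x n) matmul.

    Assumes each input is read once and the output is written once,
    which ignores caching and tiling effects.
    """
    flops = 2 * m * k * n  # one multiply and one add per inner-product term
    bytes_moved = bytes_per_element * (m * k + k * n + m * n)
    return flops / bytes_moved


def is_bandwidth_bound(
    m: int,
    k: int,
    n: int,
    bytes_per_element: int = 2,
    machine_balance: float = 100.0,  # hypothetical FLOPs-per-byte ratio of the device
) -> bool:
    """True when the matmul does too little work per byte to be compute bound."""
    return arithmetic_intensity(m, k, n, bytes_per_element) < machine_balance


if __name__ == "__main__":
    for shape in [(16, 16, 16), (4096, 4096, 4096)]:
        print(shape, is_bandwidth_bound(*shape))
```

Under this model small matmuls fall well below any realistic threshold, so the choice of kernel matters little for them and spending compile time on them has limited payoff.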
Pull Request resolved: https://github.com/pytorch/pytorch/pull/102199
Approved by: https://github.com/ngimel, https://github.com/jansel