[Mosaic GPU] Respect `packing` for TMEM scratch allocation.
Propagate the `packing` parameter to the `tmem_alloc` dialect operation and use it when computing the number of columns under Warpgroup lowering semantics.
PiperOrigin-RevId: 796316623