[flang][cuda] Fix fir.cuda_kernel_launch assembly with no args (#85987)
When the kernel launch has no arguments, the generated parser was
expecting at least a type to be present. Make the last part of the
assemble format optional.
Add a run line to round-trip the output through fir-opt so we make sure
the IR can be parsed and printed correctly.