[flang][runtime] Speed up initialization & destruction (#148087)
Rework derived type initialization in the runtime to just initialize the
first element of any array, and then memcpy it to the others, rather
than exercising the per-component paths for each element.
Reword derived type destruction in the runtime to detect and exploit a
fast path for allocatable components whose types themselves don't need
nested destruction.
Small tweaks were made in hot paths exposed by profiling in descriptor
operations and derived type assignment.