[lit] Refactor available `ptxas` features (#154439)
ToT `lit` currently assumes that a given `ptxas` version supports all
capabilities of prior `ptxas` releases. This approach was flexible
enough to support the removal of 32-bit address compilation from `ptxas`
in CUDA 12.1, but it struggles with the removal of compilation for
Volta and earlier architectures in CUDA 13.0.

To deal with this, this PR refactors how `lit` defines the set of
features available for a given `ptxas` version. It now invokes `ptxas`
not just to get its version, but also to determine the supported SMs,
the supported PTX ISA versions, and whether 32-bit compilation is
supported.

This approach should be flexible enough to deal with the changing
support matrix of `ptxas` going forward. One obvious downside is that
it relies on parsing the `stdout` of `ptxas`, which is inherently
unstable. But, IMO, this is something that we can fix as needed.
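
For illustration only, here is a minimal sketch of the kind of `stdout`
parsing described above. It is not the code in this PR: the helper names,
the `ptxas-sm_NN` feature spelling, and the sample help text are all
assumptions, and the real `ptxas` output may be formatted differently.

```python
import re

# Hypothetical excerpt of help text; the exact wording and layout of the
# real `ptxas` output are assumed here, not taken from the tool.
SAMPLE_HELP = (
    "Allowed values for this option:  'sm_75','sm_80','sm_90','sm_90a'.\n"
    "Default value:  'sm_75'.\n"
)


def parse_supported_sms(help_text):
    """Return the sorted, de-duplicated SM numbers named in help_text.

    Assumes SM names appear as quoted tokens such as 'sm_80' or 'sm_90a'.
    A trailing letter suffix is dropped, so 'sm_90a' counts as 90.
    """
    numbers = re.findall(r"'sm_(\d+)[a-z]?'", help_text)
    return sorted({int(n) for n in numbers})


def sm_features(sms):
    """Map SM numbers to feature strings (hypothetical naming scheme)."""
    return [f"ptxas-sm_{n}" for n in sms]


if __name__ == "__main__":
    for feature in sm_features(parse_supported_sms(SAMPLE_HELP)):
        print(feature)
```

Deriving features from the reported list, rather than from the version
number alone, is what lets a newer `ptxas` drop an SM without any
version-specific special case in the test configuration.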