Use the Python jit for the compilation in the C++ jit.
Using xla_computation turned out to be a dead end: it does not support all jit features and, in particular, cannot handle nested tracing.
Instead, we go directly through the existing Python jit path and use a thread-local value to access the last compiled objects from C++, which avoids touching the Python tracing logic.
This also:
- Delays access to jax_enable_64 until after GoogleInit.
PiperOrigin-RevId: 335910130