Added a step-mark operation, and split sync-tensors into a separate operation.
Added the ability to associate user-defined metadata with IR nodes, so that IR nodes can be linked back to their hosting XLATensor::Data core data structure.
Reorganized the XLA tensors arena into a more general device context arena, so that it can host other information in addition to live tensor data.