Metrics comparator (#1402)
* some initial work
* skeleton of metrics compare utils and test
* some work done; temporarily remove line in init
* first working version with test
* few cleanups
* parse percentiles and add call to real XLA metrics
* remove comments
* allow custom expressions and more ops for e2e test
* use local_vars in eval