Rename "retries" to "failed_runs" in cache collection (#2247)
* increment retry count only for completed or failed cache records
* add one more test
* fix style
* rename retries to failed_runs
* move increment to orchestrator
* address code review observation