xla
90d87741 - Set --xla_latency_hiding_scheduler_rerun to 1

Commit
2 years ago
Set --xla_latency_hiding_scheduler_rerun to 1

Summary: This flag reruns the latency hiding scheduler if the default shared memory limit of 95% leads to an OOM. Each rerun uses a limit 0.9x that of the previous run, and the number of reruns is now set to 1. The shared memory limit refers to --xla_tpu_scheduler_percent_shared_memory_limit. A lower shared memory limit means less overlap between communication and computation, and thus worse performance.

Test Plan: Tested on Llama 2 7B on V4-32.
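The retry behavior described in the summary can be sketched as follows. This is a minimal illustration, not XLA's implementation: `schedule_with_reruns`, `try_schedule`, and the parameter names are hypothetical, and the callback merely stands in for a scheduling attempt that either fits in memory or runs out.

```python
from typing import Callable, Optional


def schedule_with_reruns(
    try_schedule: Callable[[float], bool],
    initial_limit_percent: float = 95.0,  # default limit named in the summary
    max_reruns: int = 1,                  # value this commit sets
    decay: float = 0.9,                   # each rerun uses 0.9x the previous limit
) -> Optional[float]:
    """Return the first limit (in percent) that fits in memory, or None.

    `try_schedule` is a hypothetical callback: it returns True when
    scheduling at the given limit succeeds and False on OOM.
    """
    limit = initial_limit_percent
    # One initial attempt plus up to `max_reruns` retries.
    for _ in range(max_reruns + 1):
        if try_schedule(limit):
            return limit
        limit *= decay
    return None


# Stand-in for a workload that only fits when the limit is below 90%.
attempted = []


def fits_below_90(limit: float) -> bool:
    attempted.append(limit)
    return limit < 90.0


chosen = schedule_with_reruns(fits_below_90)
print(attempted, chosen)
```

With one rerun, the sketch tries 95% first and then roughly 85.5%; with zero reruns it would give up after the first OOM. Each further rerun trades some communication and computation overlap for a better chance of fitting in memory.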