xla
90d87741 - Set --xla_latency_hiding_scheduler_rerun to 1

Commit

2 years ago

Set --xla_latency_hiding_scheduler_rerun to 1

Summary: This flag reruns the latency hiding scheduler if the default shared memory limit of 95% leads to OOM. Each rerun uses a limit 0.9x that of the previous run, and the number of reruns is now set to 1. The shared memory limit refers to --xla_tpu_scheduler_percent_shared_memory_limit. A lower shared memory limit means less overlap between communication and computation, and thus worse performance.

Test Plan: Tested on Llama 2 7B on V4-32.
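The rerun behavior described in the summary can be sketched as follows. This is only an illustration of the arithmetic (start at 95%, multiply by 0.9 per rerun), not the XLA implementation; the function names `candidate_limits` and `schedule_with_rerun` and the `fits_in_memory` callback are hypothetical and stand in for the scheduler's own OOM check.

```python
from typing import Callable, List, Optional


def candidate_limits(initial_percent: float = 95.0,
                     reruns: int = 1,
                     decay: float = 0.9) -> List[float]:
    """Return the shared memory limits tried, in order.

    The first entry is the default limit; each rerun appends a value
    that is `decay` times the previous one.
    """
    limits = [initial_percent]
    for _ in range(reruns):
        limits.append(limits[-1] * decay)
    return limits


def schedule_with_rerun(fits_in_memory: Callable[[float], bool],
                        initial_percent: float = 95.0,
                        reruns: int = 1,
                        decay: float = 0.9) -> Optional[float]:
    """Return the first limit that does not OOM, or None if all attempts fail.

    `fits_in_memory` is a hypothetical stand-in for running the scheduler
    at a given limit and checking whether the result fits in memory.
    """
    for limit in candidate_limits(initial_percent, reruns, decay):
        if fits_in_memory(limit):
            return limit
    return None
```

With one rerun, the scheduler gets two attempts: the default 95% and then roughly 85.5%. Because a lower limit reduces communication and computation overlap, keeping the rerun count small bounds how much performance can be traded away to avoid OOM.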

References

#5736 - Set --xla_latency_hiding_scheduler_rerun to 1

Author

alanwaketan

Parents

4baef3c6
