auto-round
84e9a776 - Reduce peak gpu memory usage and support moe estimation (#981)

Commit

109 days ago

Reduce peak gpu memory usage and support moe estimation (#981)

- Reduce peak memory usage by calling clear_memory, taking the performance cost into account.
- Move best_params to CPU and make sure memory is cleared before moving them back.
- Move the loss device to the second card if card_0_in_high_risk.
- Support DeepSeek R1 W4A16 tuning with 3 CUDA cards (80GB) (--enable_torch_compile).
- Support Llama 3.3 70B W4A16 tuning with 2 Intel GPU cards (24GB) (--enable_torch_compile).
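The loss-device rule in the commit message can be illustrated with a small sketch. This is not the code from the commit: the `CardMemory` type, the `card_in_high_risk` and `pick_loss_device` helpers, and the 0.9 threshold are all hypothetical, chosen only to show the idea of keeping the loss on card 0 unless that card is close to full and a second card is available.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class CardMemory:
    """Memory snapshot for one accelerator card, in bytes (hypothetical type)."""
    used: int
    total: int


def card_in_high_risk(card: CardMemory, threshold: float = 0.9) -> bool:
    """Return True when the used fraction reaches the assumed threshold."""
    return card.used / card.total >= threshold


def pick_loss_device(cards: Sequence[CardMemory], threshold: float = 0.9) -> int:
    """Return the index of the card on which to compute the loss.

    Card 0 is the default. Card 1 is chosen only when card 0 is at risk
    and a second card actually exists.
    """
    if len(cards) > 1 and card_in_high_risk(cards[0], threshold):
        return 1
    return 0


GIB = 1024 ** 3
cards = [
    CardMemory(used=76 * GIB, total=80 * GIB),  # card 0 is 95% full
    CardMemory(used=40 * GIB, total=80 * GIB),  # card 1 is 50% full
]
loss_device_index = pick_loss_device(cards)
```

With a single card the helper always returns 0, so the fallback never points at a device that does not exist.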

References

#981 - Reduce peak gpu memory usage and support moe estimation

Author

xin3he

Parents

284eecdd
