auto-round
84e9a776 - Reduce peak gpu memory usage and support moe estimation (#981)

76 days ago
Reduce peak gpu memory usage and support moe estimation (#981)

- Reduce peak memory usage by calling clear_memory, while considering the performance cost.
- Move best_params to CPU and make sure memory is cleared before moving them back.
- Move the loss device to the second card if card_0_in_high_risk.
- Support DeepSeek R1 W4A16 tuning with 3 CUDA cards (80GB) (--enable_torch_compile).
- Support Llama 3.3 70B W4A16 tuning with 2 Intel GPU cards (24GB) (--enable_torch_compile).
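
The loss-device rule in the message can be read as a small selection step. The following is a minimal sketch of that idea only. The helper name `pick_loss_device`, its signature, and the single-card fallback are assumptions for illustration and are not taken from the auto-round code. Only the `card_0_in_high_risk` flag name comes from the commit message.

```python
from typing import Sequence


def pick_loss_device(devices: Sequence[str], card_0_in_high_risk: bool) -> str:
    """Return the device on which to compute the loss.

    Hypothetical helper: the name, signature, and fallback behaviour are
    assumptions for illustration, not auto-round's actual implementation.
    """
    if not devices:
        raise ValueError("at least one device is required")
    # Move the loss off card 0 only when it is flagged as high risk
    # and a second card is actually available.
    if card_0_in_high_risk and len(devices) > 1:
        return devices[1]
    # Otherwise keep the loss on the first card.
    return devices[0]


if __name__ == "__main__":
    cards = ["cuda:0", "cuda:1", "cuda:2"]
    print(pick_loss_device(cards, card_0_in_high_risk=True))
```

With a single card the sketch falls back to card 0 even when the flag is set, since there is nowhere else to place the loss. Whether the real code behaves this way is not stated in the commit message.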