DeepSpeed
Support autoTP with weight-only quantization in the DS inference path
#4750
Open
