OptimizedLinear implementation (#5355)

Commit

2 years ago

Optimized version of `nn.Linear` that adds features such as:

* LoRA with base weight sharding
* FP [6,8,12] quantization

Depends on #5336 being merged first.

Co-authored-by: @rajhans
Co-authored-by: @aurickq

---------

Co-authored-by: Rajhans Samdani <rajhans.samdani@snowflake.com>
Co-authored-by: Jeff Rasley <jeff.rasley@snowflake.com>