text-generation-inference
c0f201c9 - Factor out sharding of packed tensors

Commit
1 year ago
Factor out sharding of packed tensors

For Phi-3-Small I need to shard a packed QKV bias tensor, for which I implemented the `Weights.get_packed_sharded` method. However, this method can also replace the `Weights._get_qweight` method and the custom sharding code from `Weights.get_weights_col_packed`.
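To illustrate why a packed tensor needs dedicated sharding logic, here is a minimal sketch in plain Python. It is not the code from this commit: the function name `shard_packed`, its parameters, and the toy bias values are all hypothetical. The idea it shows is that a packed QKV tensor concatenates the Q, K, and V blocks, so a naive contiguous split would hand one rank most of Q and none of V. Instead, each block is split across ranks separately and the per-rank pieces are concatenated again.

```python
from typing import List, Sequence


def shard_packed(
    packed: Sequence[float],
    block_sizes: Sequence[int],
    rank: int,
    world_size: int,
) -> List[float]:
    """Return the shard of a 1-D packed tensor that belongs to `rank`.

    Hypothetical helper for illustration only. `packed` is the
    concatenation of several blocks (e.g. Q, K, V) whose lengths are
    given by `block_sizes`. Every block is split evenly across
    `world_size` ranks, and the pieces for `rank` are re-packed.
    """
    if sum(block_sizes) != len(packed):
        raise ValueError("block_sizes must sum to the packed length")
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")

    shard: List[float] = []
    offset = 0  # start of the current block inside `packed`
    for size in block_sizes:
        if size % world_size != 0:
            raise ValueError("each block must divide evenly across ranks")
        per_rank = size // world_size
        start = offset + rank * per_rank
        shard.extend(packed[start:start + per_rank])
        offset += size
    return shard


# Toy packed QKV bias: 4 Q entries, then 2 K entries, then 2 V entries.
packed_bias = [0, 1, 2, 3, 10, 11, 20, 21]
shard0 = shard_packed(packed_bias, [4, 2, 2], rank=0, world_size=2)
shard1 = shard_packed(packed_bias, [4, 2, 2], rank=1, world_size=2)
print(shard0)  # [0, 1, 10, 20]
print(shard1)  # [2, 3, 11, 21]
```

Each rank ends up with its own slice of Q, K, and V in the original packed order, which is the property a tensor-parallel attention layer relies on. Unequal block sizes (here Q is larger than K and V) are handled because the split is computed per block rather than over the whole tensor.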