pytorch
70830b5a - [QNNPACK, Sparsity] Sparse kernel with 4x8 blocking (#50590)

Commit

3 years ago

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50590

Larger blocking across the M dimension, such as the 8 used in the previous PR, is likely introducing wasted compute on the shapes being benchmarked. Here we introduce 4x8 blocking of mr x nr. This helps 1) in packing smaller data for small values of M, and 2) in the compute kernel, which writes the same number of bytes but more contiguously. It is not certain, but it likely helps.

Test Plan:
q8gemm-sparse-test
fully-connected-sparse-test

Imported from OSS

Reviewed By: AshkanAliabadi

Differential Revision: D25925499

fbshipit-source-id: 01c661ceea38bd6ee8321bb85cf1d5da5de4e984
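To illustrate the wasted-compute argument, here is a minimal sketch, not the QNNPACK code. It assumes the kernel processes M in whole tiles of mr rows, so M is effectively rounded up to a multiple of mr. The function names and the sample M values are hypothetical and chosen only for illustration.

```python
def padded_rows(m: int, mr: int) -> int:
    """Rows processed when M is rounded up to a whole number of mr-row tiles.

    Assumption for illustration: the kernel always works on full mr tiles.
    """
    return -(-m // mr) * mr  # ceiling division, then scale back up


def wasted_fraction(m: int, mr: int) -> float:
    """Fraction of processed rows that are padding rather than real data."""
    padded = padded_rows(m, mr)
    return (padded - m) / padded


if __name__ == "__main__":
    # Hypothetical small values of M; not the shapes benchmarked in the PR.
    for m in (1, 4, 5, 9, 12):
        print(
            f"M={m:2d}  mr=8: {padded_rows(m, 8):2d} rows "
            f"({wasted_fraction(m, 8):.0%} padding)  "
            f"mr=4: {padded_rows(m, 4):2d} rows "
            f"({wasted_fraction(m, 4):.0%} padding)"
        )
```

Under this assumption, a smaller mr never pads more rows than a larger one that it divides, which matches the commit's reasoning that 4-row blocking suits small values of M.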

Author

kimishpatel

Committer

facebook-github-bot
