pytorch
866227cf - [pt][quant] Add vector path to copy kernel for quantized data types (#36189)

Commit

4 years ago

[pt][quant] Add vector path to copy kernel for quantized data types (#36189)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36189

We only had a scalar path for the copy kernel for quantized data types. This diff adds a vector path. It should improve all the ops where copy is used. This results in 10x better performance for mul_scalar in one of the benchmarked models.

### Before:
```
-------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
Name                       Self CPU total % Self CPU total   CPU total %      CPU total        CPU time avg     Number of Calls
-------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
quantize_per_tensor        0.16%            171.287us        0.16%            171.287us        171.287us        1
quantized::conv2d          56.65%           58.830ms         56.65%           58.830ms         387.040us        152
quantized::add_scalar      6.02%            6.256ms          6.02%            6.256ms          67.270us         93
quantized::relu6           2.04%            2.121ms          2.04%            2.121ms          22.808us         93
quantized::mul_scalar      19.33%           20.076ms         19.33%           20.076ms         215.876us        93
quantized::mul             13.79%           14.320ms         13.79%           14.320ms         124.520us        115
quantized::add             1.17%            1.215ms          1.17%            1.215ms          43.388us         28
adaptive_avg_pool2d        0.04%            41.684us         0.64%            661.083us        28.743us         23
_adaptive_avg_pool2d       0.60%            619.399us        0.60%            619.399us        26.930us         23
sigmoid                    0.17%            180.745us        0.17%            180.745us        8.216us          22
dropout                    0.00%            1.798us          0.00%            1.798us          1.798us          1
view                       0.01%            8.529us          0.01%            8.529us          8.529us          1
dequantize                 0.01%            7.481us          0.01%            7.481us          7.481us          1
-------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
Self CPU time total: 103.849ms
```

### After:
```
-------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
Name                       Self CPU total % Self CPU total   CPU total %      CPU total        CPU time avg     Number of Calls
-------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
quantize_per_tensor        0.23%            193.581us        0.23%            193.581us        193.581us        1
quantized::conv2d          68.66%           58.702ms         68.66%           58.702ms         386.197us        152
quantized::add_scalar      7.11%            6.082ms          7.11%            6.082ms          65.401us         93
quantized::relu6           2.40%            2.056ms          2.40%            2.056ms          22.104us         93
quantized::mul_scalar      2.34%            2.001ms          2.34%            2.001ms          21.513us         93
quantized::mul             16.85%           14.410ms         16.85%           14.410ms         125.308us        115
quantized::add             1.34%            1.149ms          1.34%            1.149ms          41.033us         28
adaptive_avg_pool2d        0.05%            46.415us         0.78%            667.620us        29.027us         23
_adaptive_avg_pool2d       0.73%            621.205us        0.73%            621.205us        27.009us         23
sigmoid                    0.25%            215.650us        0.25%            215.650us        9.802us          22
dropout                    0.00%            2.503us          0.00%            2.503us          2.503us          1
view                       0.01%            11.608us         0.01%            11.608us         11.608us         1
dequantize                 0.01%            9.221us          0.01%            9.221us          9.221us          1
-------------------------  ---------------  ---------------  ---------------  ---------------  ---------------  ---------------
Self CPU time total: 85.500ms
```

Test Plan: buck test //caffe2/test:quantization -- 'test_qtensor_copy' --print-passing-details

Reviewed By: jspark1105

Differential Revision: D20906956

fbshipit-source-id: d538b8dc0d031ce61cb1b0af14a1c012976d75b1
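The "10x" claim can be checked against the profiler figures in the commit message. The following is a minimal sketch, not part of the commit: the dictionary and variable names are illustrative, and the only inputs are the `quantized::mul_scalar` and `Self CPU time total` values copied from the Before and After tables.

```python
# Figures copied from the Before/After profiler tables (milliseconds).
# The dict layout and variable names are illustrative assumptions.
before_ms = {"quantized::mul_scalar": 20.076, "total": 103.849}
after_ms = {"quantized::mul_scalar": 2.001, "total": 85.500}


def speedup(key: str) -> float:
    """Ratio of the 'before' time to the 'after' time for one table entry."""
    return before_ms[key] / after_ms[key]


mul_scalar_speedup = speedup("quantized::mul_scalar")
total_speedup = speedup("total")

# Absolute time saved overall and by mul_scalar alone.
saved_total_ms = before_ms["total"] - after_ms["total"]
saved_mul_scalar_ms = (
    before_ms["quantized::mul_scalar"] - after_ms["quantized::mul_scalar"]
)

# Share of the overall saving that is attributable to mul_scalar.
mul_scalar_share = saved_mul_scalar_ms / saved_total_ms

print(f"mul_scalar speedup: {mul_scalar_speedup:.2f}x")
print(f"end-to-end speedup: {total_speedup:.2f}x")
print(f"saved: {saved_total_ms:.3f} ms total, "
      f"{saved_mul_scalar_ms:.3f} ms from mul_scalar "
      f"({mul_scalar_share:.1%} of the saving)")
```

On these numbers `quantized::mul_scalar` speeds up by roughly 10x, while the model as a whole speeds up by about 1.21x, with nearly all of the saved time coming from `mul_scalar`. The other ops move by small amounts in both directions, which looks like run-to-run variation rather than an effect of the change.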

Author

dskhudia

Committer

facebook-github-bot

Parents

1443db8d
