[Dper3] Implementation of squeezed input to DC++
Summary:
This diff adds an option for the DC++ module to generate attention weights from squeezed sparse feature embeddings, which reduces the network size with the goal of QPS gains. Three squeeze options are available: sum, max, and mean, each applied along the embedding dimension. They are provided for both attention weight generation and resnet generation.
Example workflow: f208474456
{F257199459}
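Illustration only, not the code in this diff: a minimal pure-Python sketch of what the three squeeze options compute, assuming the input is laid out as [batch, num_features, embedding_dim] and that squeezing collapses the last axis to one scalar per feature. The names squeeze_embeddings and SQUEEZE_OPS are hypothetical and do not correspond to dper3 identifiers.

```python
from typing import Callable, Dict, List, Sequence


def _mean(values: Sequence[float]) -> float:
    # Arithmetic mean over one embedding vector.
    return sum(values) / len(values)


# Hypothetical mapping from option name to reducer (not a dper3 API).
SQUEEZE_OPS: Dict[str, Callable[[Sequence[float]], float]] = {
    "sum": sum,
    "max": max,
    "mean": _mean,
}


def squeeze_embeddings(
    embeddings: List[List[List[float]]], mode: str
) -> List[List[float]]:
    """Reduce each embedding vector to a scalar.

    Assumed input layout:  [batch, num_features, embedding_dim]
    Output layout:         [batch, num_features]
    """
    if mode not in SQUEEZE_OPS:
        raise ValueError(f"unknown squeeze mode: {mode!r}")
    op = SQUEEZE_OPS[mode]
    return [[op(vec) for vec in example] for example in embeddings]


# One example with two sparse features, embedding_dim = 3.
batch = [[[1.0, 2.0, 3.0], [4.0, 0.0, -1.0]]]

squeezed_sum = squeeze_embeddings(batch, "sum")
squeezed_max = squeeze_embeddings(batch, "max")
squeezed_mean = squeeze_embeddings(batch, "mean")
```

Under these assumptions, the layers that consume the squeezed input see one value per feature instead of embedding_dim values, which is where the reduction in network size would come from.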
Test Plan:
1. Test single ops
buck test dper3/dper3/modules/low_level_modules/tests:single_operators_test -- test_reduce_back_mean
buck test dper3/dper3/modules/low_level_modules/tests:single_operators_test -- test_reduce_back_max
2. Test DC++ module
buck test dper3/dper3/modules/tests:core_modules_test -- test_dc_pp_arch_one_layer_compressed_embeddings_only_squeeze_input
buck test dper3/dper3/modules/tests:core_modules_test -- test_dc_pp_arch_shared_input_squeeze_input
buck test dper3/dper3/modules/tests:core_modules_test -- test_dc_pp_input_compress_embeddings_squeeze_input
3. Test Arch
buck test dper3/dper3_models/ads_ranking/model_impl/sparse_nn/tests:sparse_nn_lib_test -- test_dense_sparse_interaction_compress_dot_arch_dot_compress_pp_squeezed_input
4. e2e test
buck test dper3/dper3_models/ads_ranking/tests:model_paradigm_e2e_tests -- test_sparse_nn_compress_dot_attention_fm_max_fc_size_squeeze_input
Reviewed By: taiqing
Differential Revision: D22825069
fbshipit-source-id: 29269ea22cb47d487a1c92a1f6daae1055f54cfc