Temporarily remove the attention model and fix pytorch_struct model. (#558)
Summary:
torchtext removed the legacy dataset utilities (https://github.com/pytorch/text/pull/1437), so we need to either migrate to the new dataset API or keep the old API and copy the related code here. This PR keeps the old API because migrating to the new one seems non-trivial.
I will re-add the attention model in a follow-up PR (and do the quality analysis there).
Also, the pytorch_struct model runs an [unsupervised learning task](https://github.com/harvardnlp/pytorch-struct/blob/master/notebooks/Unsupervised_CFG.ipynb), so it does not support the eval test.
# Batch size analysis
Batch Size | GPU Time | CPU Dispatch Time | Walltime | GPU Delta
-- | -- | -- | -- | --
16 | 53.06 | 52.969 | 53.072 | -
32 | 85.123 | 75.824 | 85.126 | 0.6042781757
64 | 157.218 | 121.155 | 157.23 | 0.8469508828
128 | 315.678 | 242.213 | 315.681 | 1.007899859
256 | 568.102 | 428.017 | 568.098 | 0.7996249343
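The GPU Delta column appears to be the relative increase in GPU time over the previous row, i.e. `t / t_prev - 1`. The sketch below reproduces it from the GPU Time column under that assumption; the names `gpu_time` and `relative_deltas` are illustrative and are not part of the benchmark code.

```python
# GPU Time column from the table above, keyed by batch size (units as reported).
gpu_time = {16: 53.06, 32: 85.123, 64: 157.218, 128: 315.678, 256: 568.102}


def relative_deltas(times):
    """Return {batch_size: t / t_prev - 1} for every row after the first."""
    sizes = sorted(times)
    # Pair each batch size with the next larger one and compare their times.
    return {cur: times[cur] / times[prev] - 1 for prev, cur in zip(sizes, sizes[1:])}


deltas = relative_deltas(gpu_time)
for bs, delta in deltas.items():
    print(f"bs={bs}: {delta:.4f}")
```

A delta close to 1.0 means GPU time roughly doubled when the batch size doubled, which is what the bs=128 row shows.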
# Non-idleness analysis (train, bs=128)

GPU is mostly idle when bs=32, so I am testing with bs=128 instead.
Data is already prefetched to the device.
Pull Request resolved: https://github.com/pytorch/benchmark/pull/558
Reviewed By: aaronenyeshi
Differential Revision: D32393368
Pulled By: xuzhao9
fbshipit-source-id: 87df0fb0667181ec1cc1a597beef3c777138d46a