transformers
76c0bc06 - [XLNet] Changed post-processing of attention w.r.t to target_mapping

Commit
6 years ago
[XLNet] Changed post-processing of attention w.r.t to target_mapping

Whenever target_mapping is provided to the input, XLNet outputs two different attention streams. Based on that, the attention output is one of the two:
- a list of tensors (the usual case for most transformers)
- a list of 2-tuples of tensors, one tensor for each of the attention streams

Docs and unit tests have been updated.
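The two output shapes described in the commit message mean that downstream code may need to branch on the per-layer element type. The following is a minimal sketch of one way to do that, not code from this commit: `split_attention_streams` is a hypothetical helper, plain strings stand in for tensors, and the order of the two streams inside each tuple is not stated in the commit message, so they are labelled only as "first" and "second".

```python
def split_attention_streams(attentions):
    """Normalize a per-layer attention output into two parallel lists.

    Hypothetical helper (not part of the transformers library).
    `attentions` is assumed to be either:
      - a list with one tensor per layer, or
      - a list with one 2-tuple of tensors per layer (two attention streams).
    Returns (first_stream, second_stream); second_stream is None in the
    single-stream case.
    """
    layers = list(attentions)
    if not layers:
        return [], None

    if isinstance(layers[0], tuple):
        # Two-stream case: unzip the per-layer pairs into two lists.
        first_stream = [pair[0] for pair in layers]
        second_stream = [pair[1] for pair in layers]
        return first_stream, second_stream

    # Single-stream case: one tensor per layer, nothing to unzip.
    return layers, None


# Usage with stand-in values instead of real tensors.
single = split_attention_streams(["layer0", "layer1"])
double = split_attention_streams([("a0", "b0"), ("a1", "b1")])
```

Branching on the element type keeps callers that never pass target_mapping unchanged, while callers that do pass it can handle each stream separately.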