transformers
226b0e46 - Add a use_parallel_residual argument to control the residual computing way (#18695)

Commit
3 years ago
Add a use_parallel_residual argument to control the residual computing way (#18695)

* Add a gpt_j_residual argument to control the residual computing way
* Put duplicate code outside of the if block
* Rename parameter "gpt_j_residual" to "use_parallel_residual" and set the default value to True
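For readers unfamiliar with the two residual styles this flag switches between, here is a minimal sketch. It is not the code from this commit: `residual_block`, `add`, and the toy sublayers are hypothetical names, vectors are plain Python lists, and the layer norms are passed in as ordinary callables. It assumes the usual meaning of the terms, where the parallel (GPT-J style) form feeds the same input to both the attention and MLP branches, and the sequential form feeds the MLP the output of the attention residual.

```python
from typing import Callable, List

Vector = List[float]
Sublayer = Callable[[Vector], Vector]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def residual_block(
    hidden: Vector,
    attn: Sublayer,
    mlp: Sublayer,
    ln1: Sublayer,
    ln2: Sublayer,
    use_parallel_residual: bool = True,
) -> Vector:
    # The attention branch is identical in both modes, so it is computed
    # once, outside the if block.
    attn_out = attn(ln1(hidden))

    if use_parallel_residual:
        # Parallel form: x = x + attn(ln1(x)) + mlp(ln2(x))
        mlp_out = mlp(ln2(hidden))
        return add(add(hidden, attn_out), mlp_out)

    # Sequential form: x = x + attn(ln1(x)), then x = x + mlp(ln2(x))
    after_attn = add(hidden, attn_out)
    return add(after_attn, mlp(ln2(after_attn)))


# Toy stand-ins for the real sublayers (illustration only).
def identity(v: Vector) -> Vector:
    return list(v)


def double(v: Vector) -> Vector:
    return [2 * x for x in v]


def square(v: Vector) -> Vector:
    return [x * x for x in v]


parallel_out = residual_block([1.0, 2.0], double, square, identity, identity, True)
sequential_out = residual_block([1.0, 2.0], double, square, identity, identity, False)
```

With these toy sublayers the two modes give different outputs for the same input, which is why the flag has to match the way a checkpoint was trained.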