transformers
Add a use_parallel_residual argument to control the residual computing way
#18695
Merged
