Align layernorm dtype handling with batchnorm (i.e., use the requested dtype for layernorm outputs, even though intermediate computations are done in f32). #331
Align layernorm dtype handling with batchnorm (i.e., use requested dt…
ed42d067
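For illustration, the behaviour described in the title can be sketched roughly as follows. This is a minimal NumPy sketch, not the code from this PR: the function name `layer_norm`, its parameters (`scale`, `bias`, `eps`, `dtype`), and the default epsilon are all assumptions made for the example. It computes the statistics in float32 and casts only the final result to the requested dtype.

```python
import numpy as np


def layer_norm(x, scale, bias, eps=1e-6, dtype=None):
    """Hypothetical layernorm: f32 intermediates, output in the requested dtype."""
    # Fall back to the input dtype when no output dtype is requested (assumption).
    out_dtype = np.dtype(dtype) if dtype is not None else x.dtype

    # Upcast so that mean and variance are accumulated in float32.
    x32 = x.astype(np.float32)
    mean = x32.mean(axis=-1, keepdims=True)
    var = x32.var(axis=-1, keepdims=True)

    # Normalize and apply the affine parameters, still in float32.
    y = (x32 - mean) / np.sqrt(var + eps)
    y = y * scale.astype(np.float32) + bias.astype(np.float32)

    # Cast once, at the very end, to the requested output dtype.
    return y.astype(out_dtype)


x = np.arange(8, dtype=np.float16).reshape(2, 4)
out = layer_norm(
    x,
    scale=np.ones(4, dtype=np.float16),
    bias=np.zeros(4, dtype=np.float16),
    dtype=np.float16,
)
```

Casting once at the end keeps the reduction numerically stable in low-precision setups while still returning the dtype the caller asked for, which is the batchnorm-style behaviour the title refers to.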