[`StableLm`] Add QK normalization and Parallel Residual Support #29745
init: add StableLm 2 support
3e0509bb
add integration test for parallel residual and qk layernorm
70662b93
update(modeling): match qk norm naming for consistency with phi/persi…
ea486cf4
fix(tests): run fwd/bwd on random init test model to jitter norm weig…
b53747f2
`use_parallel_residual`: add copy pointer to `GPTNeoXLayer.forward`
7d912dd0
refactor: rename head states var in `StableLmLayerNormPerHead`
593986a7
tests: update test model and add generate check
72a01f54
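For readers unfamiliar with the two features named in the PR title, here is a minimal pure-Python sketch of what per-head QK layer normalization and a parallel residual block generally compute. It is not the code from this PR. The function names (`layer_norm`, `layer_norm_per_head`, `decoder_block`), the list-based tensors, and the choice to share one input norm in the parallel branch are all illustrative assumptions.

```python
import math


def layer_norm(vec, weight, eps=1e-5):
    """Normalize one vector to zero mean / unit variance, then scale by `weight`."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [w * (v - mean) * inv_std for v, w in zip(vec, weight)]


def layer_norm_per_head(head_states, head_weights, eps=1e-5):
    """Hypothetical per-head norm: each attention head gets its own weight vector.

    `head_states` is a list of per-head query (or key) vectors of size head_dim.
    """
    return [layer_norm(h, w, eps) for h, w in zip(head_states, head_weights)]


def decoder_block(x, attn, mlp, norm1, norm2, use_parallel_residual):
    """Toy residual block over a single hidden vector `x` (a list of floats).

    Assumption: in the parallel branch, attention and MLP read the same
    normalized input and both outputs are added to the residual at once.
    """
    if use_parallel_residual:
        h = norm1(x)
        return [xi + a + m for xi, a, m in zip(x, attn(h), mlp(h))]
    # Sequential branch: the MLP sees the output of the attention residual.
    x = [xi + a for xi, a in zip(x, attn(norm1(x)))]
    return [xi + m for xi, m in zip(x, mlp(norm2(x)))]
```

In the sequential form the MLP depends on the attention output, while in the parallel form both branches read the same input, so the two settings generally produce different results for the same weights.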
jon-tow deleted the `add-stablelm-2-12b` branch 1 year ago
Assignees
No one assigned