chore(root): Initial commit of Phi-3 files.
c1e38b0b
fix(root): Fixes Phi-3 missing on readme.
416eaa41
fix(root): Ensures files are consistent.
e0b68151
fix(phi3): Fixes unit tests.
912edf15
fix(tests): Fixes style of phi-3 test file.
b62e6f3e
chore(tests): Adds integration tests for Phi-3.
508ec8ef
gugarosa
marked this pull request as ready for review 1 year ago
fix(phi3): Removes additional flash-attention usage, e.g., swiglu and…
56e6464f
fix(phi3): Fixes incorrect docstrings.
9bc1f1f1
fix(phi3): Fixes docstring typos.
92d83790
fix(phi3): Adds support for Su and Yarn embeddings.
c442d064
fix(phi3): Improves according to first batch of reviews.
d5aed89b
fix(phi3): Uses up_states instead of y in Phi3MLP.
3a24a1d4
fix(phi3): Uses gemma rotary embedding to support torch.compile.
4cfa767d
fix(phi3): Improves how rotary embedding classes are defined.
817fec7b
fix(phi3): Fixes inv_freq not being re-computed for extended RoPE.
9427419d
Merge remote-tracking branch 'upstream/main' into main
06cd06d2
fix(phi3): Adds last suggestions to modeling file.
2abcd4de
fix(phi3): Splits inv_freq calculation in two lines.
aeb6ae7e
Assignees
No one assigned
Labels
single-model-run-slow