transformers
RoPE loses precision for Llama / Gemma + Gemma logits.float()
#29285
Merged

ArthurZucker merged 20 commits into huggingface:main from unslothai:main
danielhanchen Update modeling_llama.py
7a257201
danielhanchen Update modeling_llama.py
db8237f4
danielhanchen Update modeling_gemma.py
3de95c42
ArthurZucker commented on 2024-02-26
ArthurZucker commented on 2024-02-27
ArthurZucker commented on 2024-02-27
danielhanchen Merge branch 'huggingface:main' into main
9e5cbb06
danielhanchen @torch.no_grad()
99d564e7
danielhanchen @torch.no_grad()
d0c08bf6
ArthurZucker approved these changes on 2024-02-27
fxmarty commented on 2024-02-27
danielhanchen Merge branch 'huggingface:main' into main
bd3a2142
danielhanchen Cos, Sin to float32
abffebb6
danielhanchen cos, sin to float32
c2e31bf4
ArthurZucker approved these changes on 2024-02-28
danielhanchen Update src/transformers/models/gemma/modeling_gemma.py
f487800f
danielhanchen Update src/transformers/models/llama/modeling_llama.py
c8526756
danielhanchen Resolve PR conflicts
1a50a4bc
danielhanchen Fix RoPE for llama
b860a22d
danielhanchen Revert "Fix RoPE for llama"
790e4a3a
danielhanchen Merge remote-tracking branch 'upstream/main'
06c76346
danielhanchen Fix RoPE for llama
aa03a433
gante approved these changes on 2024-02-28
danielhanchen RoPE device
5730a503
ArthurZucker commented on 2024-02-28
danielhanchen Autocast device type
31cea3b3
ArthurZucker commented on 2024-02-28
danielhanchen RoPE
ae9957f3
danielhanchen RoPE isinstance
ec9ef17f
ArthurZucker merged d3a4b475 into main 1 year ago
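
For readers arriving without the diff: the title and the "Cos, Sin to float32" commits point at rotary position values being computed in a low-precision dtype. The sketch below is not the code from this PR. It is a pure-Python emulation, assuming the problem is that bfloat16 keeps only 8 significant bits, so large integer position ids cannot all be represented. The helper `to_bfloat16` and the sequence length `SEQ_LEN` are hypothetical names chosen for the illustration.

```python
import math
import struct


def to_bfloat16(x: float) -> float:
    """Round x to the nearest bfloat16 value and return it as a Python float.

    Hypothetical helper for illustration only. bfloat16 is the top 16 bits of
    an IEEE float32, so we round the float32 bit pattern to 16 bits
    (round-to-nearest-even) and zero the rest. Not meant for NaN/inf inputs.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bias = 0x7FFF + ((bits >> 16) & 1)  # round-to-nearest-even bias
    bits = ((bits + bias) >> 16) << 16  # keep the top 16 bits only
    return struct.unpack("<f", struct.pack("<I", bits))[0]


SEQ_LEN = 8192  # assumed context length, chosen only for the demo

# Position ids as float32-exact values, and the same ids after a bfloat16 cast.
positions = [float(p) for p in range(SEQ_LEN)]
bf16_positions = [to_bfloat16(p) for p in positions]

# Many neighbouring positions collapse onto the same bfloat16 value.
distinct = len(set(bf16_positions))
print(f"{distinct} distinct position ids survive out of {SEQ_LEN}")

# For the first rotary pair the inverse frequency is 1.0, so the rotation
# angle equals the position id itself.
last = positions[-1]
angle_fp32 = last * 1.0
angle_bf16 = to_bfloat16(last) * 1.0
cos_error = abs(math.cos(angle_fp32) - math.cos(angle_bf16))
print(f"position {last:.0f} becomes {to_bfloat16(last):.0f} in bfloat16")
print(f"cos error at that position: {cos_error:.3f}")
```

Under this emulation, position 8191 rounds to 8192, so the last two tokens of the window would receive the same rotation. Keeping the position and cos/sin computation in float32 and casting only the final result avoids that collapse, which is what the commit messages above appear to describe.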
