transformers
Refactor the way we handle outputs for new llamas and new models
#39120

Merged

ydshieh merged 124 commits into main from clean-llamas

just update 2 files

7433c443

update other models as well just making fix-copies

37b4ef02

also add the changes needed to modeling utils

7f113b43

put this on the pretrained model instead

abf9d39d

nits and fixes

eb6747bc

update generic, fix to use config value

0f1d7e0a

update other modelings

e437edd7

use transformers kwargs instead

96aabd77

update

63df15bb

update

98f402cd

update other models

a7e0ce23

update

c9bb39ef

updates

cb5da530

update

0dc08262

update

fca73ad7

update

98739ba4

fix

124cd829

finally

4a14287a

very small nits

ea87eb70

this fixes more tests

8c66f4d0

fix other models as well!

3caf7d76

ArthurZucker marked this pull request as ready for review 357 days ago

LysandreJik approved these changes on 2025-06-30

update modularqwen2

113219be

update models based on qwen2

e7705c98

update

a74974d9

update

3fb6b710

remove the **flash stuff in favor of normal kwargs

7266aafa

vasqu approved these changes on 2025-06-30

update

c7d195fe

propagate gemma?

e63ef640

remove output attentions

1303470a

propagate

063e510d

ArthurZucker commented on 2025-06-30

Merge branch 'main' of github.com:huggingface/transformers into clean…

8c96926f

support cross attention edge case

01d4da85

same

780141ca

test this

3c0c56b8

fixes

7a0512a1

more fix

a13a98c6

update

15a8ff4f

update

22423738

update

2748b993

fix conflicts

da50ccc5

update

209d5022

fix emu3

10fb88ae

fix emu3

00afce98

move the fix a bit

3ac6c52f

what a nightmare

0b119ffb

some fixes, loss_kwargs should never have been

f7a1f0da

finish fixing gemma3n

6a132a07

fix smollm3

9fa5f266

fix another one

aaae861f

fix csm now

5e5ae84a

fix csm and mistral

075bd0c2

fix mistral now

d04c2b1a

small fixes

5065b9a2

fix janusss

6a5f410d

only for some models

4834aeca

fixup

d8ee27e4

phix phi3

e2973440

more fixes?

0c9f6de0

does this fix it?

501aead2

update

253307a3

holy shit it was just graph breaks

a267d8d4

protect torch

17cf5424

updates

c4d43c53

fix samhq?

4fc83fa3

fix moonshine

499ae87e

more moonshine fixes, 3 failures left!

b3c8641f

nits

b81df9bd

generic needs to support more

cfe62b6b

more fixes to moonshine!

6eb5e53e

fix cross attention outputs!

a9690f43

fix csm!

d462a8ea

nits

0f3c3683

fix stupid kosmos2

3cba8ac3

current updates

5af5bccd

fixes

9968c85e

use output recorder?

fbfaf040

nicer!

1f559c67

a little bit of magic

cd63172c

update

cf2e98c9

fix protect

c278e1cb

fix

e3c82cb7

small fixes

c5592be0

protect import

f6190cbf

fix a bunch of more models

d0be3319

fix fixups

22f0eaea

fix some of the last ones

422122d6

nit

feba9a03

partly fix phi

9a3708ae

update

7a0f14a7

fix import path

c4f314b3

Merge branch 'main' of github.com:huggingface/transformers into clean…

c6c5efbe

make something that is fullgraph compatible just to be sure

5f3722cf

typing was wrong on llama so the rest was wrong as well

7781368d

fucking ugly but at least it is still exportable

c9493081

style

eaa7392b

supposed to fix moonshine, it still breaks

4b6a535c

fix some default

9976ed8b

fix the last bits of sam

6d723988

update samhq

ddea6837

Merge branch 'main' of github.com:huggingface/transformers into clean…

f021967a

more fixes to sam hq

2e296b55

nit

8aaa10e2

fix all output_hidden_states and output_attentions!

b8d6666d

fix?

cb16ef8c

fix diffllama

faf2a427

updates to fix initialization on the sam pips

6c83dcc7

oops, there was a bug

bd567297

fix the last sam hq test

4213b183

fix gotocr

df766048

fix gotocr2!

a50382b3

fixes

73d74500

skip stupid tests

59ba6fab

there was one left :)

e9a3e47f

fixup

141a01ff

fix fix-copies issues with this test file

cb7a8815

fix copies for sam_hq

90e36aae

rm some comments

459062f0

skip 2 more failing tests

2614116e

fix

f5695c00

fix everything

da4875af

molbap approved these changes on 2025-07-04

Apply suggestions from code review

44da8484

add more doc!

c209f7ed

fix public init

4ae2049d

fix modular qwen3

7548ec23

qubvel commented on 2025-07-04

ydshieh merged ca7e1a37 into main 352 days ago

ydshieh deleted the clean-llamas branch 352 days ago

Reviewers

molbap

vasqu

LysandreJik

qubvel

Assignees

No one assigned

Labels

None yet

Milestone

No milestone
