Gemma3 #36658

LysandreJik merged 126 commits into huggingface:main from RyanMullins:gemma3
xenova Fix converter
5faeae3b
RyanMullins [Broken] Adds Gemma 3 to Hugging Face Transformers
b21634b1
RyanMullins Consolidating Config and Processor params across impls
8e27cd7c
RyanMullins Sorting out configuration parameters. Adds qk_norm before RoPE. Still…
72b60b03
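
The qk-norm referenced here normalizes the query and key projections before the rotary embedding is applied. A minimal sketch of the idea (module names and shapes are illustrative, not the PR's code):

```python
import torch
import torch.nn as nn


class RMSNorm(nn.Module):
    """Simple RMSNorm, used here to normalize per-head queries and keys."""

    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        return x * self.weight


# Hypothetical attention snippet: apply qk-norm first, then RoPE.
head_dim = 256
q_norm, k_norm = RMSNorm(head_dim), RMSNorm(head_dim)
q = torch.randn(1, 8, 16, head_dim)  # (batch, heads, seq_len, head_dim)
k = torch.randn(1, 8, 16, head_dim)
q, k = q_norm(q), k_norm(k)          # qk-norm before ...
# ... rotary position embeddings, which would rotate q and k here.
```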
RyanMullins Additional plumbing for CausalLM and ConditionalGeneration variants
e01056e6
RyanMullins incomplete draft of Orbax conversion script
9a084507
RyanMullins More complete checkpoint conversion
a7d7fb2c
RyanMullins Supporting Gemma 3 1B checkpoints
6d5b6372
RyanMullins Updating RoPE for multiple frequencies
2699ec62
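
"Multiple frequencies" means local (sliding-window) and global attention layers use different RoPE bases, so two angle tables are needed. A small sketch, with the base values as assumptions:

```python
import torch


def rope_inv_freq(head_dim: int, base: float) -> torch.Tensor:
    """Inverse frequencies for rotary embeddings with a given base (theta)."""
    return 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))


# Assumed values: sliding-window (local) layers use a smaller base than
# full-attention (global) layers, giving two sets of RoPE frequencies.
inv_freq_local = rope_inv_freq(256, base=10_000.0)
inv_freq_global = rope_inv_freq(256, base=1_000_000.0)

positions = torch.arange(16, dtype=torch.float32)
angles_local = torch.outer(positions, inv_freq_local)    # (seq_len, head_dim // 2)
angles_global = torch.outer(positions, inv_freq_global)
cos_local, sin_local = angles_local.cos(), angles_local.sin()
cos_global, sin_global = angles_global.cos(), angles_global.sin()
```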
RyanMullins Adjustments to rotary embedder
49c86587
RyanMullins Proof of life for text-only operation
146822a0
RyanMullins Updating the conversion script to handle multimodal projection weights
74f4acbb
RyanMullins Fixing text-only conversions
bfcc3039
RyanMullins Cleaner conversion script with multimodal support and a simpler proce…
88897b29
RyanMullins Additional refactors to the Gemma3Processor
0548c26a
RyanMullins Simplified Processor to work over text representations
1a860c71
RyanMullins Updated conversion script to join text and vision embeddings at conve…
f9036cd2
RyanMullins Logging for debugging
61f0b582
RyanMullins Update src/transformers/models/gemma2/modeling_gemma2.py
3f282c94
RyanMullins Removed extraneous Config params
8b41347b
RyanMullins Switching to fast tokenizer for checkpoint conversions
daacc1d3
RyanMullins isolating siglip for performance testing
4338957c
RyanMullins Minor changes for debugging tests against baselines
14c443cd
RyanMullins Adding average pooling for soft tokens
d45be318
RyanMullins Updating processor code to enable simpler embedding interleaving for …
cdbd03f3
RyanMullins Updating conversion script for ShieldGemma 2 conversion compatibility
ec2a7df8
pcuenca Allow disable_compile to be provided as a kwarg
85d11816
pcuenca Refresh from modular
6922438e
RyanMullins Updated conversion script and corrected sliding window
f47afe2a
pcuenca Fix type mismatch in cache_position (#4)
c40f6e21
pcuenca Fix dtype (#5)
5ebdcb82
RyanMullins fixes for embedding table overflow and missing image_soft_token_mask …
432c645e
MayankChaturvedi Adding 2D pooling for image embeddings
65350cf5
MayankChaturvedi Revert "Adding 2D pooling for image embeddings"
00af9a72
MayankChaturvedi Gemma3 average pooling changed from 1D to 2D
1a361871
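
The change from 1D to 2D pooling treats the vision patch sequence as a square grid and averages over spatial windows. A sketch under assumed SigLIP-like shapes (patch count, hidden size, and pooling factor are placeholders):

```python
import torch
import torch.nn.functional as F

# Assumed SigLIP-like output: a square grid of patch embeddings, pooled in
# two dimensions so each image yields a fixed number of soft tokens.
batch, num_patches, hidden = 2, 4096, 1152
side = int(num_patches ** 0.5)                           # 64 x 64 grid

patches = torch.randn(batch, num_patches, hidden)
grid = patches.transpose(1, 2).reshape(batch, hidden, side, side)

pooled = F.avg_pool2d(grid, kernel_size=4, stride=4)     # 2D pooling, not 1D
soft_tokens = pooled.flatten(2).transpose(1, 2)          # (batch, 256, hidden)
print(soft_tokens.shape)                                 # torch.Size([2, 256, 1152])
```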
RyanMullins Merge pull request #8 from RyanMullins/gemma3pooling
88030d16
RyanMullins Major refactor to Gemma3MultimodalInputProjection
e23b2ba6
RyanMullins Updating Gemma 3 Auto* registrations
6670e1b5
RyanMullins Add option to save Gemma 3 chat template with tokenizer during weight…
7907bf07
RyanMullins Removing unused imports
6d0dd5a6
RyanMullins Moving out-of-vocab handling from Gemma3Processor to Gemma3ForConditi…
c042cd08
RyanMullins Removing duplicate config property
10a61859
RyanMullins Removing final logit softcapping and 1-indexing of position ids
21fc6827
RyanMullins Fixing image processor config and none --> None typo
fa28a8ca
RyanMullins Fixing sliding window size for 1B
0f148d1c
RyanMullins Updating image_mean and image_std in Image Processor
48bca47e
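
For context, `image_mean` and `image_std` enter a standard normalization step after pixels are rescaled to [0, 1]; the values below are placeholders for whatever the conversion script mirrors into the processor config:

```python
import torch

# Placeholder mean/std; normalization happens after rescaling pixels to [0, 1].
image_mean, image_std = 0.5, 0.5
pixels = torch.rand(3, 896, 896)                 # channels-first, values in [0, 1]
normalized = (pixels - image_mean) / image_std   # roughly in [-1, 1]
```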
MayankChaturvedi Attention masking changed to lower triangular
576f065c
RyanMullins Merge pull request #9 from RyanMullins/gemma3attention
f137065c
RyanMullins Moving image special tokens to conversion script
e9e41bb1
RyanMullins Mirror image processor defaults from conversion script into Gemma3Pro…
ed3813d3
RyanMullins Remove special token variables from symbol space
f25309c2
RyanMullins Moving image soft token mask computation from Gemma3Processor to Gemm…
2ad61ba7
RyanMullins tie lm_head and embedding weights
a45b01cb
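
Weight tying means the LM head reuses the input embedding matrix rather than keeping its own copy. A toy sketch (sizes are illustrative):

```python
import torch.nn as nn

# Toy sizes; the point is that lm_head and the embedding table share one tensor.
vocab_size, hidden_size = 1_000, 64
embed_tokens = nn.Embedding(vocab_size, hidden_size)
lm_head = nn.Linear(hidden_size, vocab_size, bias=False)
lm_head.weight = embed_tokens.weight             # tied: one parameter, two uses
assert lm_head.weight.data_ptr() == embed_tokens.weight.data_ptr()
```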
RyanMullins Correct tied weights in Gemma3CausalLM
dae9277e
MayankChaturvedi iterative bidirectional attention
c5f84468
MayankChaturvedi resolving merge conflicts
c7c8468f
RyanMullins Reverting to Gemma 2 HybridCache with sliding window support and a sl…
a7cb4af1
RyanMullins Correcting RoPE scaling
9bb66a27
zucchini-nlp clean up first pass, dummy model generation works
bd5b5e5b
zucchini-nlp final clean up before fixing tests
e1d448c0
zucchini-nlp causal lm test works, so fine
4b9e8b45
pcuenca Fix conversion
ee837ca2
pcuenca Update src/transformers/models/gemma3/processing_gemma3.py
875c1047
pcuenca Merge remote-tracking branch 'origin/gemma3' into gemma3-convert
536d5b8c
zucchini-nlp model tests are happy
ae6f71db
zucchini-nlp processor tests are happy
de52bb59
zucchini-nlp image processing tests added
d0e0b00e
zucchini-nlp fixup
240c6958
pcuenca Fix pre-processing in conversion
42693328
pcuenca Inputs merging
b89faaf2
pcuenca Do not normalize vision embeddings
21f15c17
pcuenca Apply Ryan's (and team) changes to attention
abde03a8
zucchini-nlp token type ids + mask
613ccb3d
zucchini-nlp Merge branch 'gemma3-convert' into gemma3
0c5f50cd
zucchini-nlp template
f6f07d78
zucchini-nlp Merge remote-tracking branch 'upstream/main' into gemma3
50e17993
zucchini-nlp move embed scale, add rope scale, fix tests
0d914582
pcuenca Add chat template to tokenizer
f19907c5
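
Shipping the chat template with the tokenizer lets users format conversations via `apply_chat_template`. A usage sketch (the checkpoint name is illustrative):

```python
from transformers import AutoTokenizer

# Checkpoint name is illustrative; the template ships with the tokenizer files.
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-4b-it")
messages = [
    {"role": "user", "content": "Summarize rotary position embeddings in one sentence."},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```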
pcuenca Use prefix for causal model loading
daf6feac
zucchini-nlp use existing code for sliding mask from gemma2
b03ef675
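
The Gemma 2-style sliding mask combines the causal constraint with a limit on how far back each query may look. An illustrative construction (window size and length are placeholders):

```python
import torch

# Placeholder lengths: causal attention further limited to the last `window` keys.
seq_len, window = 8, 4
i = torch.arange(seq_len).unsqueeze(1)   # query positions, shape (seq_len, 1)
j = torch.arange(seq_len).unsqueeze(0)   # key positions, shape (1, seq_len)
causal = j <= i                          # lower-triangular constraint
in_window = (i - j) < window             # sliding-window constraint
mask = causal & in_window                # True where attention is allowed
print(mask.int())
```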
pcuenca Merge remote-tracking branch 'origin/gemma3' into multimodals-are-causal
402c7af6
pcuenca self.embed_tokens already normalizes
d36921d8
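
The normalization referred to here is the Gemma-style scaling of token embeddings by sqrt(hidden_size) inside the embedding layer, so callers should not apply the scale a second time. A toy sketch:

```python
import torch
import torch.nn as nn


class ScaledEmbedding(nn.Embedding):
    """Embedding that applies the sqrt(hidden_size) scale itself."""

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return super().forward(input_ids) * (self.embedding_dim ** 0.5)


embed_tokens = ScaledEmbedding(1_000, 64)        # toy sizes
hidden_states = embed_tokens(torch.tensor([[1, 2, 3]]))
```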
RyanMullins Correcting Gemma3TextConfig parameters in conversion script
b089958a
zucchini-nlp typo, modular overwrites my fixes
54ebbb74
pcuenca Merge branch 'gemma3' into multimodals-are-causal
50492bac
zucchini-nlp enable device map for text model
a99de0c5
pcuenca Conversion updates
f71762f6
pcuenca Merge pull request #7 from huggingface/multimodals-are-causal
e2c50bcb
zucchini-nlp ultra nit: no einsums
e9f46fd0
zucchini-nlp update image token
42b7a0a7
zucchini-nlp copy deepcopy config + some docs
d542591b
zucchini-nlp add some test, still WIP
faecbac7
RyanMullins Refactoring --include_chat_template logic in converter
de4ae310
zucchini-nlp Update src/transformers/models/gemma3/modular_gemma3.py
03ea3327
pcuenca Add eos tokens for instruct models
6ed3b7dd
pcuenca Merge pull request #8 from huggingface/convert-with-eos
d9b65411
zucchini-nlp dump so i can work on dgx
a4078295
RyanMullins Removing add_bos by default
1436ae84
zucchini-nlp dump
69f3748e
zucchini-nlp add fast im proc
fbd8a270
zucchini-nlp docs for PaS + fixup
af8081bf
zucchini-nlp another fixup
21904843
zucchini-nlp one more fixup
49524d2f
zucchini-nlp fix tests
1c57c1ed
RyanMullins Inverting prior BOS change
8ab84bb4
zucchini-nlp ultra nit
6dd1aef1
RyanMullins Reverting to Tokenizer saved with add_bos_token=True and chat templat…
ae806859
zucchini-nlp resize embeds, remove sqrt, add slow test outputs
ba77bc56
zucchini-nlp FA2 but quality is meh
aa9d1410
zucchini-nlp Merge pull request #9 from huggingface/raushan-working
35ff071c
zucchini-nlp nit
ca82ebc7
zucchini-nlp skip FA2, no idea what happened
74da7218
zucchini-nlp last bit for green CI
123402a2
zucchini-nlp please, green CI for docs
d541fe4d
zucchini-nlp T_T
12807145
RyanMullins Fix for Gemma3 logits
49141331
xenova Support both options for system prompt
4c48f139
xenova Add support for both forms of system prompts
17119428
RyanMullins Update src/transformers/models/gemma3/image_processing_gemma3_fast.py
5ad5b27e
RyanMullins Update docs/source/en/model_doc/gemma3.md
c3b02132
RyanMullins Update docs/source/en/model_doc/gemma3.md
2dd948b0
RyanMullins Update docs/source/en/model_doc/gemma3.md
5f8f8a6c
RyanMullins Update docs/source/en/model_doc/gemma3.md
cd14f3fd
RyanMullins Update docs/source/en/model_doc/gemma3.md
782bb92f
RyanMullins Docs updates now that assets are live
a3341214
github-actions marked this pull request as draft 280 days ago
ArthurZucker approved these changes on 2025-03-12
LysandreJik marked this pull request as ready for review 280 days ago
LysandreJik Style fixes
95435e91
LysandreJik merged 50d3530a into main 280 days ago
DarkLight1337 commented on 2025-03-12
