Fix converter
5faeae3b
[Broken] Adds Gemma 3 to Hugging Face Transformers
b21634b1
Consolidating Config and Processor params across impls
8e27cd7c
Sorting out configuration parameters. Adds qk_norm before RoPE. Still…
72b60b03
Additional plumbing for CausalLM and ConditionalGeneration variants
e01056e6
incomplete draft of Orbax conversion script
9a084507
More complete checkpoint conversion
a7d7fb2c
Supporting Gemma 3 1B checkpoints
6d5b6372
Updating RoPE for multiple frequencies
2699ec62
Adjustments to rotary embedder
49c86587
Proof of life for text-only operation
146822a0
Updating the conversion script to handle multimodal projection weights
74f4acbb
Fixing text-only conversions
bfcc3039
Cleaner conversion script with multimodal support and a simpler proce…
88897b29
Additional refactors to the Gemma3Processor
0548c26a
Simplified Processor to work over text representations
1a860c71
Updated conversion script to join text and vision embeddings at conve…
f9036cd2
Logging for debugging
61f0b582
Update src/transformers/models/gemma2/modeling_gemma2.py
3f282c94
Removed extraneous Config params
8b41347b
Switching to fast tokenizer for checkpoint conversions
daacc1d3
isolating siglip for performance testing
4338957c
Minor changes for debugging tests against baselines
14c443cd
Adding average pooling for soft tokens
d45be318
Updating processor code to enable simpler embedding interleaving for …
cdbd03f3
Updating conversion script for ShieldGemma 2 conversion compatibility
ec2a7df8
Allow disable_compile to be provided as a kwarg
85d11816
Refresh from modular
6922438e
Updated conversion script and corrected sliding window
f47afe2a
Fix type mismatch in cache_position (#4)
c40f6e21
Fix dtype (#5)
5ebdcb82
fixes for embedding table overflow and missing image_soft_token_mask …
432c645e
Adding 2D pooling for image embeddings
65350cf5
Revert "Adding 2D pooling for image embeddings"
00af9a72
Gemma3 average pooling changed from 1D to 2D
1a361871
Merge pull request #8 from RyanMullins/gemma3pooling
88030d16
Major refactor to Gemma3MultimodalInputProjection
e23b2ba6
Updating Gemma 3 Auto* registrations
6670e1b5
Add option to save Gemma 3 chat template with tokenizer during weight…
7907bf07
Removing unused imports
6d0dd5a6
Moving out-of-vocab handling from Gemma3Processor to Gemma3ForConditi…
c042cd08
Removing duplicate config property
10a61859
Removing final logit softcapping and 1-indexing of position ids
21fc6827
Fixing image processor config and none --> None typo
fa28a8ca
Fixing sliding window size for 1B
0f148d1c
Updating image_mean and image_std in Image Processor
48bca47e
Attention masking changed to lower triangular
576f065c
Merge pull request #9 from RyanMullins/gemma3attention
f137065c
Moving image special tokens to conversion script
e9e41bb1
Mirror image processor defaults from conversion script into Gemma3Pro…
ed3813d3
Remove special token variables from symbol space
f25309c2
Moving image soft token mask computation from Gemma3Processor to Gemm…
2ad61ba7
tie lm_head and embedding weights
a45b01cb
Correct tied weights in Gemma3CausalLM
dae9277e
iterative bidirectional attention
c5f84468
resolving merge conflicts
c7c8468f
Reverting to Gemma 2 HybridCache with sliding window support and a sl…
a7cb4af1
Correcting RoPE scaling
9bb66a27
clean up first pass, dummy model generation works
bd5b5e5b
final clean up before fixing tests
e1d448c0
causal lm test works, so fine
4b9e8b45
Fix conversion
ee837ca2
Update src/transformers/models/gemma3/processing_gemma3.py
875c1047
Merge remote-tracking branch 'origin/gemma3' into gemma3-convert
536d5b8c
model tests are happy
ae6f71db
processor tests are happy
de52bb59
image processing tests added
d0e0b00e
fixup
240c6958
Fix pre-processing in conversion
42693328
Inputs merging
b89faaf2
Do not normalize vision embeddings
21f15c17
Apply Ryan's (and team) changes to attention
abde03a8
token type ids + mask
613ccb3d
Merge branch 'gemma3-convert' into gemma3
0c5f50cd
template
f6f07d78
Merge remote-tracking branch 'upstream/main' into gemma3
50e17993
move embed scale, add rope scale, fix tests
0d914582
Add chat template to tokenizer
f19907c5
Use prefix for causal model loading
daf6feac
use existing code for sliding mask from gemma2
b03ef675
Merge remote-tracking branch 'origin/gemma3' into multimodals-are-causal
402c7af6
self.embed_tokens already normalizes
d36921d8
Correcting Gemma3TextConfig parameters in conversion script
b089958a
typo, modular overwrites my fixes
54ebbb74
Merge branch 'gemma3' into multimodals-are-causal
50492bac
enable device map for text model
a99de0c5
Conversion updates
f71762f6
Merge pull request #7 from huggingface/multimodals-are-causal
e2c50bcb
ultra nit: no einsums
e9f46fd0
update image token
42b7a0a7
copy deepcopy config + some docs
d542591b
add some test, still WIP
faecbac7
Refactoring --include_chat_template logic in converter
de4ae310
Update src/transformers/models/gemma3/modular_gemma3.py
03ea3327
Add eos tokens for instruct models
6ed3b7dd
Merge pull request #8 from huggingface/convert-with-eos
d9b65411
dump so i can work on dgx
a4078295
Removing add_bos by default
1436ae84
dump
69f3748e
add fast im proc
fbd8a270
docs for PaS + fixup
af8081bf
another fixup
21904843
one more fixup
49524d2f
fix tests
1c57c1ed
Inverting prior BOS change
8ab84bb4
ultra nit
6dd1aef1
Reverting to Tokenizer saved with add_bos_token=True and chat templat…
ae806859
resize embeds, remove sqrt, add slow test outputs
ba77bc56
FA2 but quality is meh
aa9d1410
Merge pull request #9 from huggingface/raushan-working
35ff071c
nit
ca82ebc7
skip FA2, no idea what happened
74da7218
last bit for green CI
123402a2
please, green CI for docs
d541fe4d
T_T
12807145
Fix for Gemma3 logits
49141331
Support both options for system prompt
4c48f139
Add support for both forms of system prompts
17119428
Update src/transformers/models/gemma3/image_processing_gemma3_fast.py
5ad5b27e
Update docs/source/en/model_doc/gemma3.md
c3b02132
Update docs/source/en/model_doc/gemma3.md
2dd948b0
Update docs/source/en/model_doc/gemma3.md
5f8f8a6c
Update docs/source/en/model_doc/gemma3.md
cd14f3fd
Update docs/source/en/model_doc/gemma3.md
782bb92f
Docs updates now that assets are live
a3341214
LysandreJik
marked this pull request as ready for review 280 days ago
Style fixes
95435e91