transformers
Add udop #56
Merged
ArthurZucker merged 283 commits into NielsRogge:add_udop from ArthurZucker:add_udop
add custom RMSNorm to `ALL_LAYERNORM_LAYERS` (#26227)
e3a4bd2b
Keep relevant weights in fp32 when `model._keep_in_fp32_modules` is s…
da971b22
Fix FSMT weight sharing (#26292)
26ba56cc
update hf hub dependency to be compatible with the new tokenizers (#2…
b132c170
Porting the torchaudio kaldi fbank implementation to audio_utils (#26…
9a307534
More error message fixup, plus some linebreaks! (#26296)
000e52ae
[QUICK FIX LINK] Update trainer.py (#26293)
587b7b16
Use CircleCI `store_test_results` (#26223)
06ee91ae
Fix doctest CI (#26324)
c3ecf2d9
[doc] fixed indices in obj detection example (#26343)
dcbfd93d
[`core` ] Integrate Flash attention 2 in most used models (#25598)
368a58e6
[TTA Pipeline] Fix MusicGen test (#26348)
914771cb
Add image to image pipeline (#25393)
576cd45a
feat: adding num_proc to load_dataset (#26326)
910faa3e
Fixed unclosed p tags (#26240)
5936c8c5
Update add_new_model.md (#26365)
6accd5ef
Fix MusicGen logging error (#26370)
0ee45906
[docs] removed MaskFormerSwin and TimmBackbone from the table on inde…
546e7679
Update tiny model information and pipeline tests (#26285)
d9e4bc28
Add Russian localization for README (#26208)
033ec57c
🌐 [i18n-KO] Translated `audio_classification.mdx` to Korean (#26200)
5e09af2a
Add Nougat (#25942)
ace74d16
[ViTMatte] Add resources (#26317)
a09130fe
Deleted duplicate sentence (#26394)
a8531f3b
added support for gradient checkpointing in ESM models (#26386)
6ce6a5ad
Fix DeepSpeed issue with Idefics (#26393)
0ac38750
[InternLM] Add support for InternLM (#26302)
6ba63ac3
Add torch `RMSProp` optimizer (#26425)
408b2b3c
Fix padding for IDEFICS (#26396)
abd25310
Update semantic_segmentation.md (#26419)
777f2243
Fixing tokenizer when `transformers` is installed without `tokenizers…
a0be960d
[`FA` / `tests`] Add use_cache tests for FA models (#26415)
153755ee
add bf16 mixed precision support for NPU (#26163)
946bac79
[`PEFT`] Fix PEFT multi adapters support (#26407)
3ca18d6d
[Mistral] Mistral-7B-v0.1 support (#26447)
72958fcd
Fix failing doctest (#26450)
78dd1202
Update `runs-on` in workflow files (#26435)
6ae71ec8
[i18n-DE] Complete first toc chapter (#26311)
ef81759e
🌐 [i18n-KO] Translated `debugging.md` to Korean (#26246)
a0922a53
🌐 [i18n-KO] Translated `perf_train_gpu_many.md` to Korean (#26244)
ab37b801
optimize VRAM for calculating pos_bias in LayoutLM v2, v3 (#26139)
a7e0ed82
Fix `cos_sin` device issue in Falcon model (#26448)
375b4e09
docs: change assert to raise and some small docs (#26232)
ba47efbf
change mention of decoder_input_ids to input_ids and same with decode…
098c3f40
[VITS] Fix speaker_embed device mismatch (#26115)
52e2c13d
[`PEFT`] introducing `adapter_kwargs` for loading adapters from diffe…
38e96324
Do not warn about unexpected decoder weights when loading T5EncoderMo…
216dff75
fix_mbart_tied_weights (#26422)
5e11d72d
Esm checkpointing (#26454)
4e931a8e
[Whisper Tokenizer] Make decoding faster after adding timestamps (#26…
211f93aa
[docs] Update offline mode docs (#26478)
7bb1c0c1
[docs] navigation improvement between text gen pipelines and text gen…
14170b78
Skip 2 failing persimmon pipeline tests for now (#26485)
9b23d0de
Avoid all-zero attention mask used in testing (#26469)
39117744
[Flax Examples] Seq2Seq ASR Fine-Tuning Script (#21764)
68e85fc8
[ASR Pipe] Improve docs and error messages (#26476)
0b192de1
Revert falcon exception (#26472)
67239f73
Fix num_heads in _upad_input (#26490)
ca0379b8
Fix requests connection error during modelcard creation (#26518)
7d77d7f7
Fix issue of canine forward requiring input_ids anyway (#26290)
6d02ca4b
Fix broken link to video classification task (#26487)
7d6627d0
[`PEFT`] Pass token when calling `find_adapter_config` (#26488)
24178c24
[`core`/ `auto` ] Fix bnb test with code revision + bug with code re…
6824461f
Fix model integration ci (#26322)
63864e05
[`PEFT`] Protect `adapter_kwargs` check (#26537)
1b8decb0
Remove-warns (#26483)
e4dad4fe
[Doctest] Add configuration_roformer.py (#26530)
4b4c6aab
Code-llama-nit (#26300)
bab33319
add build_inputs_with_special_tokens to LlamaFast (#26297)
c20d90d5
🌐 [i18n-KO] Translated `tokenizer_summary.md` to Korean (#26243)
1470f731
[i18n-DE] contribute chapter (#26481)
9ed538f2
Bump urllib3 from 1.26.5 to 1.26.17 in /examples/research_projects/lx…
e092b4ad
Bump urllib3 from 1.26.5 to 1.26.17 in /examples/research_projects/vi…
6de6fdd0
Bump urllib3 from 1.26.9 to 1.26.17 in /examples/research_projects/de…
cf345d5f
[RFC, Logging] Change warning to info (#26545)
df6a855e
Add tokenizer kwargs to fill mask pipeline. (#26234)
b5ca8fcd
[Wav2Vec2 and Co] Update init tests for PT 2.1 (#26494)
768aa3d9
[AMD] Add initial version for run_tests_multi_gpu (#26346)
3632fb3c
[Doctest] Add `configuration_encoder_decoder.py` (#26519)
245da7ed
Nit-added-tokens (#26538)
1a2e966c
[`Mistral`] Add Flash Attention-2 support for `mistral` (#26464)
ae9a344c
[`PEFT`] Final fixes (#26559)
2aef9a96
[`Nougat`] from transformers import * (#26562)
c26b2a29
v4.35.0.dev0
bd620591
[Whisper] Allow basic text normalization (#26149)
57f44dc4
🌐 [i18n-KO] Translated `semantic_segmentation.md` to Korean (#26515)
2c7b26f5
[Tokenizers] Skip tests temporarily (#26574)
5c66378c
docs: feat: add clip notebook resources from OSSCA community (#26505)
2f3ea08a
Bump pillow from 9.3.0 to 10.0.1 in /examples/research_projects/decis…
fc296f41
Extend Trainer to enable Ascend NPU to use the fused Adamw optimizer …
4fdf47cd
feat: add trainer label to wandb run upon initialization (#26466)
122b2657
Docstring check (#26052)
03af4c42
Add add_generation_prompt argument to apply_chat_template (#26573)
8b46c5bc
refactor: change default block_size (#26229)
6015f91a
[Mistral] Update config docstring (#26593)
0a49f909
Add # Copied from statements to audio feature extractors that use the…
9deb18ca
Fix embarrassing typo in the doc chat template! (#26596)
8b03615b
Fix encoder->decoder typo bug in convert_t5x_checkpoint_to_pytorch.py…
ca7912d1
skip flaky hub tests (#26594)
c037b2e3
Update mistral.md to update 404 link (#26590)
f9ab07f9
[Wav2Vec2] Fix tokenizer set lang (#26349)
2d8ee981
add zh translation for installation (#26084)
43bfd093
[ `NougatProcessor`] Fix the default channel (#26608)
b4e66d7a
[`GPTNeoX`] Faster rotary embedding for GPTNeoX (based on llama chang…
253f9a3f
[Falcon] Set `use_cache=False` before creating `presents` which relie…
2ab76c2c
Fix failing tests on `main` due to torch 2.1 (#26607)
54e17a15
Make `ModelOutput` serializable (#26493)
19f0b7dd
[`core`] fix silent bug `keep_in_fp32` modules (#26589)
e6d250e4
#26566 swin2 sr allow in out channels (#26568)
0a3b9d02
Don't close ClearML task if it was created externally (#26614)
9e78c9ac
Fix `transformers-pytorch-gpu` docker build (#26615)
9d206012
[docs] Update to scripts building index.md (#26546)
18fbeec8
Don't install `pytorch-quantization` in Doc Builder docker file (#26622)
75a33d60
Remove unnecessary `view`s of `position_ids` (#26059)
8878eb1b
Fixed inconsistency in several fast tokenizers (#26561)
af38c837
Update tokenization_code_llama_fast.py (#26576)
65aabafe
Remove unnecessary unsqueeze - squeeze in rotary positional embedding…
64845307
Update chat template docs with more tips on writing a template (#26625)
ea52ed9d
fix RoPE t range issue for fp16 (#26602)
87499420
Fix failing `MusicgenTest .test_pipeline_text_to_audio` (#26586)
e840aa67
remove SharedDDP as it is deprecated (#25702)
27597fea
[`LlamaTokenizerFast`] Adds edge cases for the template processor …
9ad815e4
[docstring] Fix docstring for `AlbertConfig` (#26636)
360ea8fc
docs(zh): review and punctuation & space fix (#26627)
897a826d
[DINOv2] Convert more checkpoints (#26177)
2629c8f3
Fixed malapropism error (#26660)
86a4e5a9
fix links in README.md for the GPT, GPT-2, and Llama2 Models (#26640)
8835bff6
Avoid CI OOM (#26639)
740fc6a1
fix typos in idefics.md (#26648)
c7f01bee
[docstring] Fix docstring CLIP configs (#26677)
3763101f
[docstring] Fix docstring for `CLIPImageProcessor` (#26676)
d2f06dff
[docstring] Fix docstring for DonutImageProcessor (#26641)
3257946f
Fix stale bot (#26692)
87b4ade9
[docstring] Fix docstrings for `CLIP` (#26691)
a5e6df82
Control first downsample stride in ResNet (#26374)
592f2eab
Fix Typo: table in deepspeed.md (#26705)
a9862a0f
[docstring] Fix docstring for `LlamaConfig` (#26685)
e8fdd787
fix a typo in flax T5 attention - attention_mask variable is misnamed…
975003ea
Fix source_prefix default value (#26654)
3eceaa36
[JAX] Replace uses of `jnp.array` in types with `jnp.ndarray`. (#26703)
fc639143
Make Whisper Encoder's sinusoidal PE non-trainable by default (#26032)
1e3c9dda
In assisted decoding, pass model_kwargs to model's forward call (fix …
dcc49d8a
Update docs to explain disabling callbacks using report_to (#26155)
9f406392
`Copied from` for test files (#26713)
5334796d
[Assistant Generation] Improve Encoder Decoder (#26701)
da69de17
[docstring] `SwinModel` docstring fix (#26679)
cc44ca80
fix the model card issue as `use_cuda_amp` is no more available (#26731)
69873d52
Fix stale bot for locked issues (#26711)
6ecb2ab6
Fix checkpoint path in `no_trainer` scripts (#26733)
1d6a8474
Update docker files to use `torch==2.1.0` (#26735)
b219ae6b
Revert #20715 (#26734)
e58cbed5
[docstring] Fix docstring for `LlamaTokenizer` and `LlamaTokenizerFas…
aaccf184
[docstring] Fix docstring for `CodeLlamaTokenizer` (#26709)
797a1bab
add japanese documentation (#26138)
9b7668c0
Translated the accelerate.md file of the documentation to Chinese (#2…
e1cec434
Fix doctest for `Blip2ForConditionalGeneration` (#26737)
3bc65505
Add many missing spaces in adjacent strings (#26751)
40ea9ab2
Warnings controlled by logger level (#26527)
ab0ddc99
Fix `PersimmonIntegrationTest` OOM (#26750)
72256bc7
Fix `MistralIntegrationTest` OOM (#26754)
db5e0c32
Fix backward compatibility of Conversation (#26741)
57632bf9
[docs] LLM prompting guide (#26274)
0ebee8b9
[docstring] Fix `UniSpeech`, `UniSpeechSat`, `Wav2Vec2ForCTC` (#26664)
eb734e51
[docstring] Update `GPT2` and `Whisper` (#26642)
b4199c2d
[docstring] Fix docstring for 'BertGenerationConfig' (#26661)
33df09e7
Fix `PerceiverModelIntegrationTest::test_inference_masked_lm` (#26760)
a243cdca
chore: fix typos (#26756)
883ed4b3
Skip `TrainerIntegrationFSDP::test_basic_run_with_cpu_offload` if `to…
3e93dd29
🌐 [i18n-KO] Translated `big_models.md` to Korean (#26245)
7790943c
Update expect outputs of `IdeficsProcessorTest.test_tokenizer_padding…
21da3b24
[docstring] Fix docstring for `RwkvConfig` (#26782)
d085662c
Fix num. of minimal calls to the Hub with peft for pipeline (#26385)
288bf5c1
[docstring] fix docstring `DPRConfig` (#26674)
5bfda28d
[`core`] Fix fa-2 import (#26785)
6df9179c
Disable default system prompt for LLaMA (#26765)
c9785d95
Fix Falcon generation test (#26770)
bdb391e9
Add OWLv2, bis (#26668)
762af3e3
Fixed KeyError for Mistral (#26682)
8e05ad32
[`Flava`] Fix flava doc (#26789)
7cc6f822
Add CLIP resources (#26534)
d6e5b02e
translation brazilian portuguese (#26769)
21dc5859
Fixed typos (#26810)
0dd58d96
[docstring] Fix docstring for `CanineConfig` (#26771)
0e52af4d
Add Japanese translation (#26799)
69a26c7e
[docstring] Fix docstring for `CodeLlamaTokenizerFast` (#26666)
5c081e29
Image-to-Image Task Guide (#26595)
5d997f22
Make fsdp ram efficient loading optional (#26631)
a5f5568d
fix resume_from_checkpoint bug (#26739)
b91cff5a
[OWL-ViT, OWLv2] Add resources (#26822)
570b3f9c
Add LLM doc (#26058)
805d5d21
Llama tokenizer: remove space in template comment (#26788)
3ef71345
Better way to run AMD CI with different flavors (#26634)
12cc1233
[docstring] Fix bert generation tokenizer (#26820)
5c6b83cb
Conversation pipeline fixes (#26795)
14b04b4b
🚨🚨🚨 [`Quantization`] Store the original dtype in the config as a priv…
fd6a0ade
Fix Mistral OOM again (#26847)
b8f1cde9
Chore: Typo fixed in multiple files of docs/source/en/model_doc (#26833)
b3961f72
fix: when window_size is passes as array (#26800)
85e9d644
Update logits_process.py docstrings to clarify penalty and reward cas…
0b8604d0
🚨🚨 Generate: change order of ops in beam sample to avoid nans (#26843)
4b423e60
[`FA2`] Fix flash attention 2 fine-tuning with Falcon (#26852)
41c42f85
🚨 🚨 Raise error when no speaker embeddings in speecht5._generate_spe…
db611aab
[docstring] Fix docstring for LukeConfig (#26858)
51042ae8
Fixed a typo in mistral.md (#26879)
46092f76
Translating `en/internal` folder docs to Japanese 🇯🇵 (#26747)
b002353d
Fix TensorFlow package check (#26842)
ef42cb62
Generate: improve docstrings for custom stopping criteria (#26863)
e893b1ef
Bump urllib3 from 1.26.17 to 1.26.18 in /examples/research_projects/v…
6d644d68
Bump urllib3 from 1.26.17 to 1.26.18 in /examples/research_projects/d…
bece55d8
Knowledge distillation for vision guide (#25619)
280c757f
Fix Seq2seqTrainer decoder attention mask (#26841)
34678db4
[`Tokenizer`] Fix slow and fast serialization (#26570)
ef7e9369
Emergency PR to skip conversational tests to fix CI (#26906)
de55ead1
Add default template warning (#26637)
d933818d
Refactor code part in documentation translated to japanese (#26900)
eec5a3a8
[i18n-ZH] Translated fast_tokenizers.md to Chinese (#26910)
732d2a8a
[`FA-2`] Final fix for FA2 dtype (#26846)
5a73316b
Add fuyu model (#26911)
caa0ff0b
[`FA-2`] Revert suggestion that broke FA2 fine-tuning with quantized …
574a5384
[docstring] Fix docstring for `ChineseCLIP` (#26880)
816c2237
[Docs] Make sure important decode and generate method are nicely disp…
734dd96e
Fix and re-enable ConversationalPipeline tests (#26907)
bdbcd5d4
[docstring] Fix docstrings for `CodeGen` (#26821)
ad08137e
Fix license (#26931)
73dc23f7
Pin Keras for now (#26904)
cbd278f0
[`FA-2` / `Mistral`] Support fa-2 + right padding + forward (#26912)
bc4bbd9f
Generate: update basic llm tutorial (#26937)
ae4fb846
Corrected modalities description in README_ru.md (#26913)
08a2edfc
[docstring] Fix docstring for speech-to-text config (#26883)
929134bf
fix set_transform link docs (#26856)
9b197669
Fix Fuyu image scaling bug (#26918)
c030fc89
Update README_hd.md (#26872)
224794b0
Added Telugu [te] translations (#26828)
093848d3
fix logit-to-multi-hot conversion in example (#26936)
f71c9ccf
Limit to inferior fsspec version (#27010)
70032949
python falcon doc-string example typo (#26995)
45425660
skip two tests (#27013)
ef978d0a
Nits in Llama2 docstring (#26996)
d33d3131
Change default `max_shard_size` to smaller value (#26942)
50d0cf4f
Add Seamless M4T model (#25693)
cb45f71c
[`NLLB-MoE`] Fix NLLB MoE 4bit inference (#27012)
244a53e0
[`SeamlessM4T`] fix copies with NLLB MoE int8 (#27018)
f9f27b0f
small typos found (#26988)
c0b5ad94
Merge branch 'main' of github.com:huggingface/transformers into add_udop
8f151eb6
fixups
4bdcc24d
more fixups
08685306
fix the tokenizers
6d98a920
remove un-necessary changes
dbbb0990
nits
536e339b
nits
24bc54a5
ArthurZucker merged c07e6e04 into add_udop 2 years ago
Reviewers: No reviews
Assignees: No one assigned
Labels: None yet
Milestone: No milestone