Update expected values for one more `test_speculative_generation` aft…
5748352c
FIX(trainer): ensure final checkpoint is saved when resuming training…
564fde14
Add new model LFM2-VL (#40624)
c5325757
Fix outdated version checks of accelerator (#40969)
f6104189
Use `skip_predictor=True` in vjepa2 `get_vision_features` (#40966)
7cf1f5ce
[Trainer] Fix DP loss (#40799)
9378f874
[timm_wrapper] better handling of "Unknown model" exception in timm (…
6e51ac31
Fix Issue #39030: AutoTokenizer.from_pretrained does not propagate to…
2ce35a24
[tests] Really use small models in all fast tests (#40945)
dd7ac4cd
Add captured actual outputs to CI artifacts (#40965)
738b223f
Revert change in `compile_friendly_resize` (#40645)
d9d7f6a6
Track the CI (model) jobs that don't produce test output files (proce…
5ac3c517
Remove `set_model_tester_for_less_flaky_tests` (#40982)
5c2f5663
Benchmarking v2 GH workflows (#40716)
47c1a1b4
ENH: Enable readline support for transformers chat (#40911)
5a246131
[testing] test `num_hidden_layers` being small in model tester (#40992)
103fe0d5
blt wip (#38579)
a5ffae62
[`RMSNorm`] Fix rms norm init for models that center around 1 (#40796)
78f3e087
Make `EfficientLoFTRModelTest` faster (#41000)
a89ed714
Fix typoes in src and tests (#40845)
662ea950
Fix more dates in model cards and wrong modalities in _toctree.yml (#…
f73f73d4
RUFF fix on CI scripts (#40805)
6e1270d2
fix dict like init for ModelOutput (#41002)
251825aa
[tests] update `test_left_padding_compatibility` (and minimize overwr…
f47c6514
Patch more `unittest.case.TestCase.assertXXX` methods (#41008)
b164209d
🚨 [lightglue] fix: matches order changed because of early stopped ind…
d6d2d03b
Fix `PhimoeIntegrationTest` (#41007)
b2b50448
Fix Glm4v test (#41011)
e5a9a1de
Update after #41007 (#41014)
9de898e5
Fix benchmark runner argument name (#41012)
c1cf8dee
Adding support for Qwen3Omni (#41025)
41813d32
Making compute_loss_func always take priority in Trainer (#40632)
71f768bc
Modify Qwen3Omni parameter name since VL changed it (#41045)
23d0c62a
Fix Qwen video tests (#41049)
f1a8aff9
[testing] Fix `qwen2_audio` (#41018)
c6d3d0b9
Fix typing of tuples (#41028)
30dadfd5
Remove optax (#41030)
c931992d
Fix typos in English/Chinese documentation (#41031)
84600532
Use torch.autocast (#40975)
e6f5f948
docs: improved RoPE function Docstrings (#41004)
1ca91812
Fix condition for emitting warning when generation exceeds max model …
7425f6dc
Fix outdated torch version check (#40925)
9b221a84
Add Whole Word Masking and Padding Strategy to DataCollatorForLanguag…
c2c9074b
[testing] Fix `seed_oss` (#41052)
5fb3b354
Remove repeated import (#40937)
36911028
Simplify unnecessary Optional typing (#40839)
d43b73cb
Add write token for uploading benchmark results to the Hub (#41047)
9de77d70
Ci utils (#40978)
98e87dbf
Fix CI jobs being all red 🔴 (false positive) (#41059)
bdbe9878
Update quantization CI (#41068)
abbf0edd
[i18n-bn] Add Bengali language README file (#40935)
a9266c98
Improve documentation and errors in Mamba2-based models (#41063)
ed8d3aaa
Update team member list for some CI workflows (#41094)
fc974a97
fix crash when using chat to send 2+ request to gptoss (#40536)
dca053d1
Minor addition, no split modules for VideoMAEE (#41051)
ea92b1a0
Switch to `python:3.10-slim` for CircleCI docker images (#41067)
722be9f5
Fix argument name in benchmarking script (#41086)
e140ee3c
Fix typos in documentation (#41087)
9957b448
Fix typing (#40788)
281b8b62
Remove unused arguments (#40916)
72e7f343
fix wrong height and width when read video use torchvision (#41091)
93655f31
docs: Fix Tool Use links and remove dead RAG links (#41104)
c42b27b9
[tests] gpt2 + `CausalLMModelTester` (#41003)
9d9177f4
Fix `_get_test_info` for inherited tests (#41106)
8291a7fc
Remove bad test skips (#41109)
7bf0c7d3
Format empty lines and white space in markdown files. (#41100)
1f7c6535
Update ruff to 0.13.1 + target Python 3.10 + apply fixes (#37809)
a5a88829
Support loading LFM2 GGUF (#41111)
38c30bba
[torchao safetensors] integrate torchao safetensors support with tran…
f212a0b4
[Qwen3-next] Fix dimension mismatch in torch_chunk_gated_delta_rule a…
957b5568
Fix the error where a keyword argument appearing before *args (#41099)
7fde9757
Fix broken `` expressions in markdown files (#41113)
c6f31abf
Remove self-assignment (#41062)
48c8c8db
Fixed MXFP4 model storage issue (#41118)
25c8ac57
Fixed loading LongT5 from legacy checkpoints (#40724)
0bc795f8
dummy commit (#41133)
99630b85
Fix loading logic flaw with regards to unexpected and missing keys (#…
6e913fc9
Fix: align Qwen2.5-VL inference rope index with training by passing s…
477b7a3a
Fix single quotes in markdown (#41154)
287652a2
extend gemma3n integration ut cases on XPU (#41071)
174a5c4e
Add Parakeet (#39062)
53ce2f82
Fix format of compressed_tensors.md (#41155)
bd77d70e
Simplify and improve model loading logic (#41103)
6566998e
Force new vision models addition to include a fast image processor (#…
83fc0ee9
Add language specifiers to code blocks of markdown files (#41114)
d3d02925
Improve `add_dates` script (#41167)
46a138cd
Fix flash-attn for paged_attention when no kernels (#41078)
2e498266
Remove data from examples (#41168)
ac8703da
Enable fa in amd docker (#41069)
14b45582
handle flash slow tests (#41072)
d8152615
Modernbert fix (#41056)
cd154ae9
CI Runners - move amd runners mi355 and 325 to runner group (#41193)
56a74c39
[XPU] Add MXFP4 support for XPU (#41117)
9a76ebfa
[tests] `CausalLMTester` automatically infers other test classes from…
97ee50f3
More typing fixes (#41102)
deac4530
enable flex attention ut cases on XPU (#40989)
9b7c343f
fix(trainer): Avoid moving model with device_map (#41032)
469336de
Fix attention sink implementation in flex attention (#41083)
01d8cc0e
Separate docker images for Nvidia and AMD in benchmarking (#41119)
fae2d679
Make quantizers good citizens loading-wise (#41138)
3017f04b
[`Kernels Attention`] Change fallback logic to error out on explicit …
389115c1
Add EdgeTAM (#39800)
ec368a27
Fix EXAONE-4.0 dummy id (#41089)
068e7091
Fix 8bit bnb loading (#41200)
a2b6ccff
Fix docker quantization (#41201)
a2cdcccc
Embed interactive timeline in docs (#41015)
be826baa
[docs] Fix links (#41110)
090ad5db
Remove unnecessary Optional typing (#41198)
e0973709
docs/examples(speech): pin CTC commands to Hub datasets; add Windows …
5c0fd100
Fix Qwen3-Omni audio_token_id serialization issue (#41192)
f588aa82
Wait for main process in _save_checkpoint to ensure best checkpoint e…
4c54a98a
Avoid assumption that model has config attribute in deepspeed (#41207)
7e698ed8
Trainer: Pass `num_items_in_batch` to `compute_loss` in `prediction_s…
4886248d
[ESM] add accepts_loss_kwargs=False to EsmPreTrainedModel (#41006)
d1fd30d9
Align pull request template to bug report template (#41220)
9f7da26d
[generate] cache missing custom generate file (#41216)
8a23f340
Remove old Python code (#41226)
99fbb874
Adapt to the SDPA interface to enable the NPU to call FlashAttentionS…
4c6f26e0
update code owners (#41221)
3b03f55b
Unify is_torchvision_v2_available with is_torchvision_available (#41227)
12c4e6a6
Fix typing of train_args (#41142)
b0427d66
Fix sliding window attn mask (#41228)
50907d32
Revert "Fix DeepSpeed mixed precision precedence over Accelerate defa…
7db62848
[docs] Fix tp_plan (#41205)
86982f21
Fix white space in documentation (#41157)
ac69a8f1
fix qwen text config (#41158)
186a357f
Video processor accepts single frames on cuda (#41218)
46954e49
Use math.log2 (#41241)
25e26411
fix TrainerIntegrationDeepSpeed UT failures (#41236)
7076da63
[repo utils] Update `models_to_deprecate.py` (#41231)
9e60961e
Use removeprefix and removesuffix (#41240)
7aca328f
Fix pylint warnings (#41222)
c6af1ca2
Remove all instances of `is_safetensors_available` (#41233)
b7757de7
FP-Quant NVFP4 and Python 3.9 support (#39876)
c7616fdf
[`FA3`] Fix masking and loading logic in same process (#41217)
f672ee02
[t5gemma] fix `get_text_config` and related fixes (#40939)
066ca8e4
Don't convert to `safetensors` on the fly if the call is from testing…
e49d3d6b
Resolve remote custom module path warnings (#41243)
9ab2d57a
add peft team members to issue/pr template (#41262)
a6f470f6
docs: update bitsandbytes platform support (#41266)
19826920
add more activation kernels, follow up (#40944)
9e34b40e
fix asr pipeline ut failures (#41275)
e80da3a4
Use regex defailed flags (#41264)
d8566bc6
Fix multi-video timestamp bug in Qwen-3-VL and GLM4V (#41229)
d88a0fbb
Fix binding of video frames to video placeholder in `InternVL` model …
54c026ea
Deprecate Trackio environment variables and deploy to Spaces by defau…
03d976d1
Allow private Space id for Trackio (#40948)
37f1f5d7
fix async client for transformers chat (#41255)
247d21ad
Unify is_torchvision_v2_available with is_torchvision_available (#41259)
26c57efa
Use max/min (#41280)
91e1bdd0
Biogptlogits (#41270)
4f1faa06
Fix unnecessary single-item container checks (#41279)
9d67585e
Fix pylint generator warnings (#41258)
89d53495
feat: use `aws-highcpu-32-priv` for amd docker img build (#41285)
f8ec172c
Add processor and intergration test for qwen3vl (#41277)
a2de2937
Remove `test_initialization` (#41261)
27b9c795
Remove some previous team members from allow list of triggering Githu…
0995a484
Build doc in 2 jobs: `en` and `other languages` (#41290)
41eae7ad
Fix mxfp4 dequantization (#41292)
aca2380b
[`Flex Attn`] Fix lse x attention sinks logic (#41249)
531bb750
FIX: Bug in PEFT integration delete_adapter method (#41252)
cf88fbb6
Italian translation for README.md (#41269)
40329a8b
Fix README.md error when installing from source (#41303)
e656e264
download and use HF Hub Cache (#41181)
a6e9ec43
ArthurZucker changed the base branch from `main` to `v4` 176 days ago
fix some merge issues
010896e8
[test_all]
8270a0f8
[test-all]
e6d80873
LysandreJik marked this pull request as ready for review 175 days ago