huggingface/transformers
Commits
clean-modeling
cleanup
ArthurZucker
committed
171 days ago
a304370d
fix attentions
ArthurZucker
committed
171 days ago
e155ae85
todo comment
ArthurZucker
committed
171 days ago
cfa8c407
nits
ArthurZucker
committed
171 days ago
81b26782
small fix
ArthurZucker
committed
171 days ago
6c34e8a5
push what it should look like for llama
ArthurZucker
committed
171 days ago
eb727c50
up
ArthurZucker
committed
174 days ago
b1351120
some more cleanup
ArthurZucker
committed
175 days ago
f06e39ed
Most probably explicit register of the attention classes to prevent holding the references in the decoder layer. Will do that in a bit
ArthurZucker
committed
175 days ago
a4ec22ae
revert changes to the cursted libraries
ArthurZucker
committed
175 days ago
54242e21
more changes propagation
ArthurZucker
committed
175 days ago
f3341519
revert tf changes
ArthurZucker
committed
175 days ago
2d71cb33
more updats
ArthurZucker
committed
175 days ago
f1672862
updates
ArthurZucker
committed
175 days ago
ce609258
all the files this PR is scoped to tackle
ArthurZucker
committed
179 days ago
6099f095
modify align as well
ArthurZucker
committed
179 days ago
2dbf9a17
proper cleanup of albert
ArthurZucker
committed
179 days ago
675cd88e
fix
ArthurZucker
committed
179 days ago
2ab361bd
fix merge conflicts with main
ArthurZucker
committed
179 days ago
4853171a
models that we should update
ArthurZucker
committed
179 days ago
168db36c
fix return dict not even present
ArthurZucker
committed
179 days ago
531dc0c2
update
ArthurZucker
committed
179 days ago
2de1fcd0
more advanced api
ArthurZucker
committed
179 days ago
7d0ab1a1
fixes, and update mixtral
ArthurZucker
committed
179 days ago
39322ed1
pass compile exigences
ArthurZucker
committed
179 days ago
8fcc7915
oups
ArthurZucker
committed
179 days ago
19ef67ad
keep one check explicit
ArthurZucker
committed
179 days ago
b7ae7d92
update
ArthurZucker
committed
179 days ago
46b1b90b
lol one single nit
ArthurZucker
committed
179 days ago
71838e4a
?
ArthurZucker
committed
179 days ago
4e35168f