transformers
4b2a9216 - Refactor core_model_loading to support FSDP shard-on-read loading

Commit
5 days ago
Refactor core_model_loading to support FSDP shard-on-read loading

- Add FSDPShardOperation and unified get_parallel_materialization_context
- Load FSDP params by reading live DTensor metadata (device_mesh, placements)
- Add get_parameter_or_buffer lookup with FSDP safety guard
- Add concretize_target_patterns for wildcard converter isolation
- Add summarize.md documenting the FSDP loader design
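
As a rough illustration of the shard-on-read idea named in the commit message, the sketch below computes which slice of a full checkpoint tensor a given rank would read, so that only that slice needs to be materialized. It is not taken from the commit: `local_shard_bounds` and `read_shard` are hypothetical helper names, plain Python lists stand in for tensors, and the ceil-division split is an assumption modeled on `torch.chunk`-style sharding along one dimension.

```python
import math
from typing import List, Sequence, Tuple


def local_shard_bounds(dim_size: int, num_shards: int, rank: int) -> Tuple[int, int]:
    """Return the half-open [start, end) range owned by `rank`.

    Assumption: the sharded dimension is split with ceil-division chunks,
    so trailing ranks may receive a shorter or even empty shard.
    """
    if not 0 <= rank < num_shards:
        raise ValueError(f"rank {rank} is outside [0, {num_shards})")
    chunk = math.ceil(dim_size / num_shards)
    start = min(rank * chunk, dim_size)
    end = min(start + chunk, dim_size)
    return start, end


def read_shard(full_rows: Sequence[Sequence[float]], num_shards: int, rank: int) -> List[Sequence[float]]:
    """Slice out only the rows that `rank` owns (sharding along dim 0).

    Hypothetical stand-in for reading a slice from a checkpoint file
    instead of loading the whole tensor and splitting it afterwards.
    """
    start, end = local_shard_bounds(len(full_rows), num_shards, rank)
    return list(full_rows[start:end])


if __name__ == "__main__":
    # A 5-row "weight" split across 4 ranks: shard sizes 2, 2, 1, 0.
    weight = [[float(r)] * 3 for r in range(5)]
    for rank in range(4):
        print(rank, local_shard_bounds(len(weight), 4, rank), read_shard(weight, 4, rank))
```

In a real loader the shard dimension and the number of shards would come from the parameter's live metadata (the commit mentions `device_mesh` and `placements`) rather than being passed in by hand.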