pytorch/xla: [torchax]: JittableModule statedict handling #9195
Merged

zmelumian972 99 days ago

torchax aims to improve seamless interoperability between torch and jax.

One part of the torch training pipeline revolves around storing and loading state dicts (checkpoints).

Most of the objects dealing with torch checkpoints expect a (non-nested) dict mapping each weight name to its value (on either a CPU or GPU device).

Since torchax tensors are held in jax containers, torch checkpointers cannot easily handle them.

This change makes JittableModule convert in its state_dict functions (both load and get), so that extracting the state dict before saving it as a checkpoint is seamless for the user.
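A minimal sketch of the get-direction conversion (not the PR's exact code): jax_view is assumed here as the counterpart of the torch_view used later in this thread, and tensor.j2t as the inverse of the tensor.t2j referenced below.

from torch.utils import _pytree as pytree
from torchax import tensor
from torchax.interop import jax_view  # assumed counterpart of torch_view

def to_cpu_state_dict(torchax_state_dict):
  """Convert a flat dict of jax-backed torchax tensors to CPU torch.Tensors."""
  # jax_view unwraps each torchax tensor to its underlying jax array;
  # tensor.j2t (assumed inverse of tensor.t2j) then copies each array back
  # into a regular torch CPU tensor that torch.save can handle.
  jax_sd = jax_view(torchax_state_dict)
  return pytree.tree_map(tensor.j2t, jax_sd)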

zmelumian972 changed the title Torchax: JittableModule statedict handling [Torchax]: JittableModule statedict handling 99 days ago
zmelumian972 changed the title [Torchax]: JittableModule statedict handling [torchax]: JittableModule statedict handling 99 days ago
qihqi requested a review from qihqi 99 days ago
qihqi commented on 2025-05-20
torchax/torchax/interop.py
        self._jitted[key] = call

    def state_dict(self, *args, **kwargs):
qihqi 99 days ago

I think this function should return torchax.Tensors (and leave it to users who want to move them to CPU to call the move).

Or maybe name it cpu_state_dict.

In regular torch, nn.Module's state_dict method returns references to the weights inside the module, so there is an implicit understanding that no copy is made.

zmelumian972 98 days ago

I prefer cpu_state_dict and keeping it torch native.

Keeping it in torchax.Tensors might make it harder for checkpointers that expect native torch tensors (because torch.save does not expect torchax.Tensors).
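For illustration, the user-facing checkpoint flow being targeted might look as follows; the model, path, and setup call are hypothetical, and the exact method name (state_dict vs. cpu_state_dict) is what is being discussed above.

import torch
from torch import nn
import torchax
from torchax.interop import JittableModule

torchax.enable_globally()  # assumed setup call; details vary by torchax version

class MyModel(nn.Module):  # stand-in model for illustration
  def __init__(self):
    super().__init__()
    self.linear = nn.Linear(4, 2)

m = JittableModule(MyModel())
torch.save(m.state_dict(), "ckpt.pt")     # leaves are plain CPU torch tensors
m.load_state_dict(torch.load("ckpt.pt"))  # parameters transferred back to device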

torchax/torchax/interop.py

    """
    Wrapper for load_state_dict

    This function assumes a torch CPU state dict and will transfer the parameters to the correct device
qihqi 99 days ago

Maybe make this function work with both CPU tensors and torchax.Tensors?

zmelumian972 98 days ago

Sure, that's a good idea

I will simply skip converting a torch tensor to torchax.tensor.Tensor if it is already one, meaning that both torchax.tensor.Tensor and torch.Tensor are acceptable.
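A small sketch of that guard, reusing the tensor.t2j and torch_view names that appear in the suggestion below:

from torchax import tensor
from torchax.interop import torch_view

def _as_torchax(v):
  # Already a jax-backed torchax tensor: pass it through without copying.
  if isinstance(v, tensor.Tensor):
    return v
  # Plain torch CPU tensor: move it to jax, then re-wrap it in the torch view.
  return torch_view(tensor.t2j(v))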

torchax/torchax/interop.py
qihqi 99 days ago

Can probably make use of from jax.experimental.shard_alike import shard_alike.

Something like:

# shard_alike returns a pair (x_resharded, y_resharded); keep the first.
sharded_dict_jax = pytree.tree_map(
    lambda cpu_tensor, orig: shard_alike(tensor.t2j(cpu_tensor), orig)[0],
    state_dict, current_state_dict)
state_dict = torch_view(sharded_dict_jax)
...
zmelumian972 98 days ago

It's a bit unnatural here because state_dict and current_state_dict are not guaranteed to have the same tree structure (the user can specify strict=False and call with a partial state dict).

IIRC, pytrees do not support that natively.
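A quick illustration of the mismatch, with hypothetical keys:

import jax

full = {"linear.weight": 0, "linear.bias": 1}  # current_state_dict
partial = {"linear.weight": 0}                 # strict=False partial checkpoint

try:
  # Different key sets are different pytree structures, so tree_map refuses.
  jax.tree_util.tree_map(lambda new, orig: (new, orig), partial, full)
except ValueError as err:
  print(err)  # dict key mismatch between the two trees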

qihqi 98 days ago

I see, makes sense.

zmelumian committed [torchax] Support for JittableModule::state_dict() (0c045cbf)
zmelumian972 force-pushed from 43404e5d to 8b9bf46d 98 days ago
qihqi requested a review from qihqi 98 days ago
qihqi approved these changes on 2025-05-20
qihqi 98 days ago

Thanks! Please fix the lint with

yapf -i -r *.py test/ scripts/ torch_xla/ benchmarks/ torchax/

Thanks

zmelumian committed [torchax] Added JittableModule::load_state_dict mechanism (121cc008)
zmelumian972 force-pushed from 8b9bf46d to 121cc008 98 days ago
zmelumian972 94 days ago

How do I move forward? I am unfamiliar with the pytorch/XLA CI and resources.

qihqi enabled auto-merge (squash) 85 days ago
auto-merge disabled 71 days ago (manually disabled by user)
qihqi enabled auto-merge (squash) 71 days ago
qihqi 71 days ago

Hi @zmelumian972 can you rebase? The GPU CIs are not merge blocking anymore on the new HEAD. I also enabled auto-merge so if CI passes it should merge automatically. Thanks!

qihqi approved these changes on 2025-06-16
zmelumian972 66 days ago

Done :)

auto-merge disabled 65 days ago (manually disabled by user)
qihqi merged afe425e2 into master 65 days ago
