[loading] Correctly load params during offloading & careful memory considerations (#42632)
* do not load everything in advance
* fix
* fix
* fix
* fix
* fix memory leaks during conversion
* oops
* fix device_map
* add doc
* fix
* doc
* make it a method
* doc
* first shot at test
* fix test
* fix
* revert test: CPU memory is too hard to track correctly
* fix