transformers
6f6095e0 - Refactor weight loading (#41580)

Commit
38 days ago
Refactor weight loading (#41580) * ah actually we don't discard lm head if missing -> needs to be moved to correct device and etc * fix some tests * small fixes * up * up * dik why we tie weights twice but,..,,. * ups * removeunused * fix hunyuan * small fix * nits * ish * up * rev * fix more tie weights keys * small fixes * nit * update * fix and fix * fix a test * glubs * current shitty changes * ship validated ones * more * more update * more * more * more * mllama * more up * fix ernie * fix xopies * up more * more fixes * up * up * fix-copies * fix more * more updates * AI UPDATE * up * hoey * make it fast * fix * lol * fix asjusting * more fixes * _dtype nit * up * nit * update * update * remove semaphores * fix import to avoid jit execution * try to remove custom tiing logic when its stupid * fix more individual models * fix whisper as well * fix? * fox umt5 * improve tqdm bar * cleanup a bit * oupsi * some updates * improve * remove all buffering -> much faster without it * remove some tie_weights custome funcs when not needed * more fixes related to strict matching regex * remove ALL custom tie weights * small update * revert change to init scheme (no need for params) * mixtral init * try less strict source check * tied weight first shot to the fiiiixxxxxx * does this help? * :) * fix some ppolry defined tied_weights_keys for now * subclass nn.Parameters * up * lol * Ouiiii * fix led * fix long cat flash * fix qwen and long cat flash * properly fix qwen init * just push this for now * propnet is dumb * update * push * remove explict sharing of some tied keys. * update decoder.bias * moe case * more changes to untangle old hardcoded ting * fixup * fix big faileurs * fix prophnet * fix resize token embeddings * nits * fix xcodex * asyncio? * fix smart apply * fix data-2-vec * [build-ci-image] * checkout * uupdate * fix hunyuan * update error message * fix deformable detr * fixes * fix init weights for non param gate up projs * shared todo? * update some models * big revert, don't break this behaviour * ty @SunMarc this fixes the buffers Co-authored-by: SunMarc <SunMarc@users.noreply.github.com> * mt5 fuck * fix lxmbert * nuke slow test fetcher * fix zamba and deepcopy for now * fix zamba tied weight keys! ~ * fix-copies * update fetch terst * fix gradient for test modeling common! * break "shared" for now I will fix tomorrow changes are properly isoalted now :) * does this fix marian? probably not * fix some vlms * D fine seems to handle this well * glob is fine actually * fix dab detr * small steps * opusy * fix some more models? * yups * better erro * fix? * fix double escape * escape wehere it makes sense * ?? * fix ibert * fix tvp as well * more fxes * try always download ref PR * ONONONO * big fixup * more fixup * small step * small nits * nits * brut force some stuff * fix vilt * make sure special models that always need tie always tie * cleaning up * small nits * fix zamba and bridge tower! * just fixup * potential culprits * revert bark and fix bridgetower * remove now non existant tie_weights * ? * lol reformer actually had nothing tied! * wow these two fucking models were really not well made * fix sam family! * fix bark revision * fix speech2test ? * push this for now.... * upsy * the fuck * fix rtdetr * update * proper * wow that one 's annoying * update * try to find the culprit * get some help on common * nit about general init and cls.padding_idx * revert num workers update * remove old loading func * fix glob * add annotations * fix re * small improvements * clean some stuff * improvements * someone did not understannnnnnd what I tried to dooo or does BNB not support that either? * gluos * fix case when `.` is just not there * remove unused arg * recover orignal parameter/buffer using _original * fix glob issu * this? * deepspeed best-effort * remove unused stuff * Update tie weight keys as they were just wroong Co-authored-by: Benjamin Bossan <benjaminbossan@users.noreply.github.com>" * up * augustuc clauss, a gloubs gloups gloubs * fixup * fixup * there was fucking typo * mrain * nits * fix marian 3 remaining tests * one more * fix some of the copies, not all :) * small cleanup * one propertest * fix core model loadig tes * attempt a new test * fix some of the annoying tests by supporting reading .bin sometimes * push * push more small fixes * remove 1 useless test * up * fix audio flamingo post rebase * fixup * some small updatess * fix sam models * nits * up * updates * onem ore * skip this stupid test * some other fixes * fixup * update * skip more offloaded stuff * oups * ups * update mixtral * skip this one * LET"SGO * fixup * rope delta order * fix csm * small nit --------- Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com> Co-authored-by: SunMarc <SunMarc@users.noreply.github.com> Co-authored-by: Marc Sun <marc@huggingface.co>
Author
Parents
Loading