@Rocketknight1 So this PR should fix the issue I referenced in my last PR about subfolder params not working right. I hadn't understood what `subfolder` meant at the time. I'll verify that I can remove the chdir in my previous test case with this change.
@Rocketknight1 @amyeroberts Okay, I updated the test from my previous PR to directly load from the arbitrary path instead of relying on the current directory, and it seems to work now.
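For context, the shape of the updated test is roughly this (an illustrative sketch, not the exact code in the PR):

```python
# Sketch only: copy a repo with custom code to an arbitrary directory and load it by path,
# without chdir-ing into the parent of that path first.
import shutil
import tempfile

from transformers import AutoConfig


def check_load_from_arbitrary_path(source_repo_dir: str):
    with tempfile.TemporaryDirectory() as tmp:
        # Nest the copy so the path has no relation to the repo_id or the cwd.
        target = shutil.copytree(source_repo_dir, f"{tmp}/some/nested/location")
        AutoConfig.from_pretrained(target, local_files_only=True, trust_remote_code=True)
```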
I think that this change is ready for your consideration.
Taking a look today/Monday!
Hi @rl337, I'm investigating this now. It seems clean, but can you give me an example of a repo where the `auto_map` parameter has the form `model_id--module.classname`, just so I can experiment with this? Not all repos with custom code use that structure, so I'd like to check with other people at Hugging Face what exactly the intended formatting and behaviour is for those fields.
@Rocketknight1 sure. The first place that I encountered this was https://huggingface.co/togethercomputer/StripedHyena-Nous-7B
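Its config.json carries `auto_map` entries in that form. With a local copy of the repo you can see them with something like this (the path is just where I keep it locally, and the comment shows roughly what the `AutoConfig` entry looks like):

```python
# Print the auto_map from a local copy of the repo.
import json

with open("/mnt/model_storage/togethercomputer/StripedHyena-Nous-7B/config.json") as f:
    config = json.load(f)

print(json.dumps(config["auto_map"], indent=2))
# The AutoConfig entry is roughly:
#   "AutoConfig": "togethercomputer/StripedHyena-Nous-7B--configuration_hyena.StripedHyenaConfig"
```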
@Rocketknight1 One thing that my fix doesn't address is cross-model dependencies. It's something that I was considering trying out, but it would definitely cause issues with the current code.
Consider for a moment that you have a "library model" which doesn't actually do anything itself but holds the config and implementations of many models and tokenizers, and then you refer to this model from other models. Consider the directory structure:
```
/path/to/my_id
    /library_model
        base_model.py
    /model_a
        config.json
    /model_b
        config.json
```
Both `model_a` and `model_b` define their AutoModel map as `my_id/library_model--base_model.SomeCommonModel`.
Is that something we want to support here? If that's the case, I'd change the fix here to take a `model_storage_dir` instead of a `model_id_or_path` and use that to join against the `model_id` (e.g. `my_id/library_model`) instead of using it as the path to the files themselves.
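Roughly, the resolution I have in mind would look something like this (names are illustrative, not actual transformers internals):

```python
# Hypothetical sketch: resolve a "repo_id--module.classname" auto_map entry
# against a local storage root instead of treating the argument as the repo directory itself.
import os


def resolve_module_file(model_storage_dir: str, auto_map_entry: str) -> str:
    repo_id, class_ref = auto_map_entry.split("--")
    module_name, _class_name = class_ref.rsplit(".", 1)
    return os.path.join(model_storage_dir, repo_id, module_name + ".py")


# resolve_module_file("/path/to", "my_id/library_model--base_model.SomeCommonModel")
# -> "/path/to/my_id/library_model/base_model.py"
```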
I could imagine this flexibility to be super useful.
Hi @rl337, firstly sorry for taking so long to try a reproduction here, but I'm struggling to figure out the issue with StripedHyena-7B, or possibly I'm misunderstanding the problem. I've tried the following:
- `StripedHyena-7B` with `AutoModelForCausalLM.from_pretrained("togethercomputer/StripedHyena-Nous-7B")`
- `AutoModelForCausalLM.from_pretrained("togethercomputer/StripedHyena-Nous-7B", local_files_only=True)`
- `AutoModelForCausalLM.from_pretrained("/path/to/local/dir", local_files_only=True)`
In all of these cases, it seems to work fine. Can you help me out with some specific steps to reproduce the issue so I can dig into what's going on here?
Okay, here is the code block that fails for me with that model:
```python
from transformers import AutoConfig, AutoTokenizer, AutoModel

model_path = '/mnt/model_storage/togethercomputer/StripedHyena-Nous-7B'
conf = AutoConfig.from_pretrained(model_path, local_files_only=True, trust_remote_code=True)
model = AutoModel.from_config(conf)
```
I'm running this from a subdirectory of my home directory, not from anywhere in the path to the model.
The failure looks like this:
```
rlee@amalgam:~/dev/transformers$ PYTHONPATH=./src ./venv/bin/python test.py
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
Could not locate the configuration_hyena.py inside togethercomputer/StripedHyena-Nous-7B.
Traceback (most recent call last):
  File "/home/rlee/dev/transformers/src/transformers/utils/hub.py", line 398, in cached_file
    resolved_file = hf_hub_download(
  File "/home/rlee/dev/transformers/venv/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/rlee/dev/transformers/venv/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1362, in hf_hub_download
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: Cannot find the requested files in the disk cache and outgoing traffic has been disabled. To enable hf.co look-ups and downloads online, set 'local_files_only' to False.
```
If I edit the config.json to remove the `togethercomputer/StripedHyena-Nous-7B--` part of the `AutoConfig` entries, the code works.
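In other words, the workaround boils down to something like this one-off edit of the local copy (illustrative; it rewrites config.json in place, so keep a backup):

```python
# Strip the "repo_id--" prefix from every auto_map entry in a local config.json.
import json

config_path = "/mnt/model_storage/togethercomputer/StripedHyena-Nous-7B/config.json"

with open(config_path) as f:
    config = json.load(f)

config["auto_map"] = {key: value.split("--")[-1] for key, value in config.get("auto_map", {}).items()}

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```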
If I run the code from /mnt/model_storage, where it coincidentally makes togethercomputer/StripedHyena-Nous-7B a correct path relative to the current directory, the code also works.
Hi @rl337, the exact same code works for me! I had to replace `AutoModel` with `AutoModelForCausalLM` because `AutoModel` wasn't in the `auto_map`, but otherwise it was all fine. I think this might be some kind of environment issue. Can you try:
```
pip install --upgrade huggingface_hub
pip install --upgrade git+https://github.com/huggingface/transformers.git
```
Also, in the log you posted I see `None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.` Is it possible that the issue is that PyTorch classes like `AutoModel` are just failing to initialize because `torch` isn't present?
Yeah, I didn't have PyTorch in that virtual environment, but it gives me the same failure if PyTorch is installed. Here is the run after installing the CPU-only PyTorch so that we don't get that warning.
```
rlee@amalgam:~/dev/transformers$ PYTHONPATH=$HOME/dev/transformers/src $HOME/dev/transformers/venv/bin/python $HOME/dev/transformers/test.py
Could not locate the configuration_hyena.py inside togethercomputer/StripedHyena-Nous-7B.
Traceback (most recent call last):
  File "/home/rlee/dev/transformers/src/transformers/utils/hub.py", line 398, in cached_file
    resolved_file = hf_hub_download(
  File "/home/rlee/dev/transformers/venv/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/rlee/dev/transformers/venv/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1362, in hf_hub_download
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: Cannot find the requested files in the disk cache and outgoing traffic has been disabled. To enable hf.co look-ups and downloads online, set 'local_files_only' to False.
```
I was using transformers in place by adding transformers/src to PYTHONPATH, and the checkout was upstream/main updated this morning, so it's the latest code. For the sake of consistency, though, I did the installs in the virtual environment as you suggested:
```
rlee@amalgam:~/dev/transformers$ ./venv/bin/pip install --upgrade huggingface_hub
Requirement already satisfied: huggingface_hub in ./venv/lib/python3.10/site-packages (0.20.3)
Collecting huggingface_hub
  Downloading huggingface_hub-0.21.3-py3-none-any.whl.metadata (13 kB)
Requirement already satisfied: ...
Installing collected packages: huggingface_hub
  Attempting uninstall: huggingface_hub
    Found existing installation: huggingface-hub 0.20.3
    Uninstalling huggingface-hub-0.20.3:
      Successfully uninstalled huggingface-hub-0.20.3
Successfully installed huggingface_hub-0.21.3
```
and transformers:
```
rlee@amalgam:~/dev/transformers$ ./venv/bin/pip install --upgrade git+https://github.com/huggingface/transformers.git
Collecting git+https://github.com/huggingface/transformers.git
  Cloning https://github.com/huggingface/transformers.git to /tmp/pip-req-build-w2n2ou3s
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers.git /tmp/pip-req-build-w2n2ou3s
  Resolved https://github.com/huggingface/transformers.git to commit 0ad770c3733f9478a8d9d0bc18cc6143877b47a2
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: ...
Building wheels for collected packages: transformers
  Building wheel for transformers (pyproject.toml) ... done
  Created wheel for transformers: filename=transformers-4.39.0.dev0-py3-none-any.whl size=8593793 sha256=382de428f4f8fb87f4a918d7a957a800648d9da9fafefcc9cfbf55aad64d1ebd
  Stored in directory: /tmp/pip-ephem-wheel-cache-8m3urbja/wheels/e7/9c/5b/e1a9c8007c343041e61cc484433d512ea9274272e3fcbe7c16
Successfully built transformers
Installing collected packages: tokenizers, transformers
Successfully installed tokenizers-0.15.2 transformers-4.39.0.dev0
```
Doing this means I can remove the PYTHONPATH since transformers is now installed in the virtual environment. Here's the re-run of the same code:
```
rlee@amalgam:~/dev/transformers$ $HOME/dev/transformers/venv/bin/python $HOME/dev/transformers/test.py
Could not locate the configuration_hyena.py inside togethercomputer/StripedHyena-Nous-7B.
Traceback (most recent call last):
  File "/home/rlee/dev/transformers/venv/lib/python3.10/site-packages/transformers/utils/hub.py", line 398, in cached_file
    resolved_file = hf_hub_download(
  File "/home/rlee/dev/transformers/venv/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/rlee/dev/transformers/venv/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1397, in hf_hub_download
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: Cannot find the requested files in the disk cache and outgoing traffic has been disabled. To enable hf.co look-ups and downloads online, set 'local_files_only' to False.
```
If I cd to /mnt/model_storage, where the path becomes valid again, here is what the output looks like:
```
rlee@amalgam:/mnt/model_storage$ $HOME/dev/transformers/venv/bin/python $HOME/dev/transformers/test.py
Traceback (most recent call last):
  File "/home/rlee/dev/transformers/test.py", line 5, in <module>
    model = AutoModel.from_config(conf)
  File "/home/rlee/dev/transformers/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 439, in from_config
    raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers_modules.StripedHyena-Nous-7B.configuration_hyena.StripedHyenaConfig'> for this kind of AutoModel: AutoModel.
```
Which sounds like what you hit, but this error is moot because we got past loading the model config object, which is what this patch is about. If I change `AutoModel` to `AutoModelForCausalLM` as you did, here is the output:
```
rlee@amalgam:/mnt/model_storage$ $HOME/dev/transformers/venv/bin/python $HOME/dev/transformers/test.py
The repository for /mnt/model_storage/togethercomputer/StripedHyena-Nous-7B contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//mnt/model_storage/togethercomputer/StripedHyena-Nous-7B.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Do you wish to run the custom code? [y/N] y
```
So we hit the issue that we merged a fix for in my previous PR. If we go back to the directory relative to my home directory, we fail the way I noted when filing the bug.
```
rlee@amalgam:/mnt/model_storage$ cd ~/dev/transformers/
rlee@amalgam:~/dev/transformers$ $HOME/dev/transformers/venv/bin/python $HOME/dev/transformers/test.py
Could not locate the configuration_hyena.py inside togethercomputer/StripedHyena-Nous-7B.
Traceback (most recent call last):
  File "/home/rlee/dev/transformers/venv/lib/python3.10/site-packages/transformers/utils/hub.py", line 398, in cached_file
    resolved_file = hf_hub_download(
  File "/home/rlee/dev/transformers/venv/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/rlee/dev/transformers/venv/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1397, in hf_hub_download
    raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: Cannot find the requested files in the disk cache and outgoing traffic has been disabled. To enable hf.co look-ups and downloads online, set 'local_files_only' to False.
```
It's very reproducible for me from a fresh virtual environment. Have you tried from a fresh virtual env?
I just tried with a fresh `conda` install and it seemed to work fine - I got an import error from inside the modelling file:

```
ImportError: For `use_flash_rmsnorm`: `pip install git+https://github.com/HazyResearch/flash-attention.git#subdirectory=csrc/layer_norm
```
This clearly indicates that the modelling code was found and executed. I can't seem to reproduce this bug no matter where I put the repo directory, or where I call it from!
@Rocketknight1 Yeah, that's just because that model has a million crazy dependencies. You got past the failure that I'm running into.
I am starting to wonder if it's a Python version issue. What version of Python and conda do you use? I've been using 3.9.6 on the Mac, and on the Linux box it's 3.10.12. They are pretty vanilla installs because I always either containerize or use virtual environments. Neither of them is part of a conda package.
I was using Python 3.11, conda 23.10.0.
Still working on this. I just haven't had time lately.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
No stale please, bot! This is still a live issue
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Gentle ping @Rocketknight1
I think we can close this - I wasn't able to reproduce the issue, and it seems environment-specific! It can be reopened later if we can get a reproducible error.
When you have a model in a path that isn't exactly the repo_id relative to the current directory, and the config has an AutoConfig entry of the form `model_id--module.classname` in it, you can't load the model using the path to the model, because resolving `module.classname` ends up being relative to the repo_id as defined in the config.
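For concreteness, an entry of that form looks something like this (the names here are made up for illustration):

```python
# Illustrative auto_map contents using the model_id--module.classname form.
auto_map = {
    "AutoConfig": "username/model_id--configuration_model.MyModelConfig",
    "AutoModel": "username/model_id--modeling_model.MyModel",
}
```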
Let me lay this out.
What I want to do is, from my `__main__.py`, load my model from `AutoModel.from_pretrained()`, so I pass `path/to/large/storage/models/model_a` as `model_id_or_path` with `local_files_only=True`, because I only want to use the specific model that I have on my filesystem. When you try to do this, you end up with the `LocalEntryNotFoundError` exception shown earlier in the thread.
Here, even though I'm trying to load the model from a path, it's trying to resolve the code relative to the repo_id. This is problematic because if I made changes to the model code and I'm not restricting downloads, I may download and use out-of-date code from the hub. If I don't notice that it's downloaded code, it'd be super confusing to debug.
The workaround is to edit the config.json to remove the `repo_id--` part of the definition, which is kind of annoying because if you want to push changes to the hub afterwards, you need to remember to add the `repo_id--` back in.
I think the root problem here is that when the model_id_or_path is specified as a path, what it's really doing is acting like a path to the config.json, and then it doesn't treat the directory it loads the config.json from as a self-contained model. It instead tries to resolve things defined in the config.json relative to the current directory or relative to the model hub/cache. Concretely, it requires a directory structure that looks more like this:
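Something along these lines (reconstructed for illustration; the exact file names don't matter):

```
path/to/sourcecode
    __main__.py
    /username
        /model_id
            config.json
            configuration_model.py
            modeling_model.py
```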
When I run my `__main__.py` from the directory designated by `path/to/sourcecode`, everything seems to work okay, because resolution of the model_id `username/model_id` happens relative to the current directory.

What does this PR do?
This PR adds a check to see if the repo_or_path is a path containing the module file to download. If it is, it loads from that path instead of referencing the repo_id. It then tries to load the `module.classname` from the path rather than the repo_id when dynamically loading classes.
At this point, the config.json has already been loaded from the path, so the path is likely okay to load code from, especially since trust_remote_code must be true at this point.
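Conceptually, the check is along these lines (a sketch of the idea, not the actual diff):

```python
# Sketch: when dynamically loading a class, prefer the local directory the config came from
# if the referenced module file actually exists there; otherwise fall back to the hub repo_id.
import os


def pick_code_location(repo_id_or_path: str, repo_id_from_config: str, module_file: str) -> str:
    candidate = os.path.join(repo_id_or_path, module_file)
    if os.path.isdir(repo_id_or_path) and os.path.isfile(candidate):
        return repo_id_or_path  # load module.classname from the local path
    return repo_id_from_config  # fall back to resolving against the hub / cache
```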
Fixes # (issue)
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.