IDEFICS: allow interpolation of vision's pos embeddings (#26029)
* add pos embed interpolation for vision encoder
* style
* update config with interpolate_pos_encoding arg
* fix imports formatting
* remove the "Copied from" comment on vision embeddings
* add test for image embeddings interpolation
* add credit for interpolation code
* Update src/transformers/models/idefics/configuration_idefics.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/idefics/vision.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix condition checking that the number of image patches matches the shape of the pos embeddings
* use kwargs in the forward methods for interpolation
* fix tests
* have interpolate_pos_encoding default to False instead of None
* Update tests/models/idefics/test_modeling_idefics.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update src/transformers/models/idefics/configuration_idefics.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* remove leftover debug for loop that printed k, v
* add interpolate_pos_encoding arg in prepare_inputs_for_generation
* add test for interpolated generation
* fix edge case num_patches == num_positions and height == width
* add test for edge case
* fix pos_embed in interpolate
* allow interpolation in bf16 with upcasting
* Update src/transformers/models/idefics/vision.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* add multiple images tests for interpolation and generation
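
For readers unfamiliar with the idea, here is a rough, dependency-free sketch of what interpolating patch position embeddings means: a learned square grid of embeddings is resampled to the patch grid of a differently sized image. This is not the code from this PR. The function name `interpolate_pos_embeddings` and its arguments are hypothetical, bilinear resampling is used only for simplicity (the actual change may use another interpolation mode), and any class-token embedding is assumed to be handled separately.

```python
def interpolate_pos_embeddings(pos_embed, old_grid, new_h, new_w):
    """Resample a flattened (old_grid * old_grid, dim) table of patch position
    embeddings to (new_h * new_w, dim) using bilinear interpolation.

    Hypothetical helper for illustration only; not the IDEFICS implementation.
    """
    assert len(pos_embed) == old_grid * old_grid
    dim = len(pos_embed[0])

    def at(row, col):
        # Look up the embedding stored at (row, col) of the original grid.
        return pos_embed[row * old_grid + col]

    out = []
    for i in range(new_h):
        # Map the target row onto the source grid (corners stay aligned).
        y = i * (old_grid - 1) / (new_h - 1) if new_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, old_grid - 1)
        wy = y - y0
        for j in range(new_w):
            x = j * (old_grid - 1) / (new_w - 1) if new_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, old_grid - 1)
            wx = x - x0
            a, b, c, d = at(y0, x0), at(y0, x1), at(y1, x0), at(y1, x1)
            # Blend the four neighbouring embeddings, one dimension at a time.
            out.append([
                (1 - wy) * ((1 - wx) * a[k] + wx * b[k])
                + wy * ((1 - wx) * c[k] + wx * d[k])
                for k in range(dim)
            ])
    return out


# A 2x2 grid of 1-dimensional embeddings, resampled to 3x3.
small = [[0.0], [1.0], [2.0], [3.0]]
resized = interpolate_pos_embeddings(small, old_grid=2, new_h=3, new_w=3)
```

When the target grid equals the original grid, the sketch returns the embeddings unchanged, which mirrors the `num_patches == num_positions and height == width` edge case mentioned above.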
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>