Add Flash Attention 2 support to Bark (#27364)
* change handmade attention mask to _prepare_4d_attention_mask
* add Flash Attention 2 support in Bark
* add Flash Attention 2 tests on BarkSemanticModel
* make style
* fix Flash Attention 2 and tests + make style
* fix memory leak and allow Bark to pass flash attention to sub-models
* make style
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove unnecessary code from tests + justify overriding
* Update tests/models/bark/test_modeling_bark.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* make style
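The first commit's mask change can be sketched as follows. This is a simplified NumPy illustration of what a `_prepare_4d_attention_mask`-style helper does (the real Transformers helper operates on torch tensors and handles dtypes; the function name and `neg_inf` value below are illustrative): a 2D padding mask of shape `(batch, src_len)`, where 1 means "attend" and 0 means "masked", is expanded into an additive 4D bias of shape `(batch, 1, tgt_len, src_len)`.

```python
import numpy as np

def prepare_4d_attention_mask(mask, tgt_len=None, neg_inf=-1e9):
    """Expand a 2D padding mask (batch, src_len) into an additive 4D bias
    (batch, 1, tgt_len, src_len): 0.0 where attention is allowed, a large
    negative value where it is masked. Simplified sketch, not the real helper."""
    batch, src_len = mask.shape
    if tgt_len is None:
        tgt_len = src_len
    # (batch, src_len) -> (batch, 1, 1, src_len), then broadcast over tgt_len
    expanded = mask[:, None, None, :].astype(np.float32)
    expanded = np.broadcast_to(expanded, (batch, 1, tgt_len, src_len))
    # invert: 1 ("attend") -> 0.0 bias, 0 ("masked") -> neg_inf bias
    return (1.0 - expanded) * neg_inf

mask = np.array([[1, 1, 0]])  # one sequence, last position is padding
bias = prepare_4d_attention_mask(mask)
```

Expressing the mask as an additive bias lets it be added directly to the attention scores before the softmax, instead of being applied with handmade indexing.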
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
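The "allow Bark to pass flash attention to sub-models" change can be sketched with plain dataclasses. Bark is a composite model whose sub-models (semantic, coarse, fine) each carry their own config, so the top-level attention implementation must be copied down to them. All names below (`SubModelConfig`, `CompositeConfig`, `propagate_attn_implementation`) are hypothetical stand-ins, not the actual Transformers classes:

```python
from dataclasses import dataclass, field

@dataclass
class SubModelConfig:
    # hypothetical stand-in for a sub-model config; "eager" is the default
    _attn_implementation: str = "eager"

@dataclass
class CompositeConfig:
    # hypothetical stand-ins for Bark's semantic/coarse/fine sub-configs
    semantic: SubModelConfig = field(default_factory=SubModelConfig)
    coarse: SubModelConfig = field(default_factory=SubModelConfig)
    fine: SubModelConfig = field(default_factory=SubModelConfig)

def propagate_attn_implementation(config, impl):
    """Copy the top-level attention implementation into every sub-config,
    so each sub-model builds its attention layers the same way."""
    for sub in (config.semantic, config.coarse, config.fine):
        sub._attn_implementation = impl
    return config

cfg = propagate_attn_implementation(CompositeConfig(), "flash_attention_2")
```

Without this propagation step, requesting Flash Attention 2 on the composite model would leave the sub-models on their default attention path.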