Prefill-related logic in input preparation for generation (#42088)
* add prefill arg in generation
* add a slow test
* fix copies
* can be like this but checking special tokens isn't good
* I guess this solves the issue with assisted_gen+prefill
* update overwritten `prepare_inputs_for_generation`
* prefill is actually when we have no cache at all. Try this for now
* first iteration is not always technically the same as prefill
* fix?
* fix now?
* update bloom
* fix smth
* make style
* fix copies and skip test
* fix copies
* tiny updates after a review
* fix other slow tests
* fix copies
* do not pass the same kwargs twice in prefill
* oops
* have to revert? prob fails only on dgx
* adjust slow test again
* address comments
* fix copies