Support inputs_embeds (#687)
* support inputs_embeds
* update tests to test inputs_embeds
* make input_ids an optional input to forward
* remove check for both input_ids and inputs_embeds in MPTForCausalLM
  The base model already performs this check, and passing both is common practice during autoregressive generation: the embeds are used on the first step, and once the KV cache is nonempty, input_ids are used instead.
* reorder kwargs
* add more tests
* fix device merge artifact in test_model.py
* fix generate test
* yapf
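
The embeds-then-ids behavior described above can be sketched roughly as follows. This is a minimal, framework-free illustration and not the actual MPTForCausalLM code: the helper name `select_model_inputs` and the use of plain lists in place of tensors are assumptions made for the example.

```python
from typing import Any, Dict, Optional, Sequence


def select_model_inputs(
    input_ids: Optional[Sequence[int]],
    inputs_embeds: Optional[Sequence[Sequence[float]]] = None,
    past_key_values: Optional[Sequence[Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: choose which input to forward on a generation step.

    Both input_ids and inputs_embeds may be supplied by the caller. The
    embeddings are only used while the KV cache is still empty (the first
    step); afterwards the newly generated token ids are used instead.
    """
    cache_is_empty = not past_key_values  # None or an empty sequence
    if inputs_embeds is not None and cache_is_empty:
        return {"inputs_embeds": inputs_embeds}
    return {"input_ids": input_ids}


# First step: cache is empty, so the prompt embeddings are forwarded.
first_step = select_model_inputs([1, 2], [[0.1], [0.2]], past_key_values=None)

# Later step: cache is populated, so only the new token id is forwarded.
later_step = select_model_inputs([3], [[0.1], [0.2]], past_key_values=[("k", "v")])
```

Because exactly one of the two inputs is forwarded on any given step, a base-model check that rejects receiving both at once is still satisfied.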