Fix batched inference/generation, position_ids creation, falcon alibi, gpt_bigcode multi-query,.. #2326
test left-padded batched inference
63a6efe6
demonstrate batched text generation failure
39496d8a
fix remote code
2ccc1503
fix
ecf65d55
fix position_ids generation inside ORTModelForCausalLM class
9f3eedc1
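The `position_ids` fix above addresses left-padded batches: positions must be counted from the first real token, not from index 0. This is not the PR's actual code, just a minimal numpy sketch of the common pattern (cumulative sum of the attention mask, with pad slots clamped):

```python
import numpy as np

def make_position_ids(attention_mask: np.ndarray) -> np.ndarray:
    """Derive position_ids from a left-padded attention mask.

    Real tokens get 0, 1, 2, ... starting at the first non-pad token;
    padding slots are clamped to 0 (they are masked out anyway).
    """
    position_ids = attention_mask.cumsum(axis=-1) - 1
    position_ids[attention_mask == 0] = 0  # clamp pad slots
    return position_ids

mask = np.array([[0, 0, 1, 1, 1],
                 [1, 1, 1, 1, 1]])
print(make_position_ids(mask))
# [[0 0 0 1 2]
#  [0 1 2 3 4]]
```

Without this, left-padded sequences would receive positions shifted by the amount of padding, which corrupts rotary/learned position embeddings during batched generation.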
it works until transformers 4.52 -_-
b7bec5e4
now run with latest transformers
0df42e5b
boolean 4D mask is actually not supported by torch onnx exporter
999a145a
only test generation with batched inputs, for logits are a bit off be…
638856e8
boolean mask safe softmax batched inference
3d405020
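The "safe softmax" commit above refers to a known batched-inference pitfall: with left padding, an attention row can be entirely masked, and a naive `-inf` bias then yields NaN. A minimal numpy sketch of the usual workaround (a large finite negative bias instead of `-inf`, so fully masked rows degrade to a finite uniform distribution) — an illustration, not the PR's actual patch:

```python
import numpy as np

def safe_masked_softmax(scores: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Softmax over the last axis under a boolean mask.

    Masked entries get the dtype's minimum instead of -inf, so a row
    that is masked everywhere produces a finite uniform distribution
    rather than NaN (exp(-inf)/sum(exp(-inf)) = 0/0).
    """
    neg = np.finfo(scores.dtype).min
    biased = np.where(mask, scores, neg)
    biased = biased - biased.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(biased)
    return exp / exp.sum(axis=-1, keepdims=True)

scores = np.zeros((2, 3))
mask = np.array([[True, True, False],
                 [False, False, False]])  # second row fully masked
print(safe_masked_softmax(scores, mask))
```

The fully masked row sums to 1 with no NaN, which keeps exported ONNX graphs numerically stable on padded batches.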
style
023d2ac9
use old typing
accf8522
don't do unnecessary patching
0965ea93
try to avoid spamming the hub for an image
d1f9bbd2
update min transformers version
01c40843
better and direct torch patching
aeeecb2e
more batched generation special cases
fc62f420
style
ba994fbe
initialize the pil image instead of downloading it
de6a798d
use random pil image
cf164b31
test different versions of transformers in fast tests
5934bf9f
fix
4b76f5e7
revert diffusers changes for now
e171196f
mask padding kv cache as well
5ab88b6a
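"Mask padding kv cache as well" points at a second place padding must be handled: once a left-padded prompt is cached, every decoding step's attention mask must still cover those stale pad slots in the cache. A tiny numpy sketch of the shape bookkeeping (illustrative only; shapes are hypothetical):

```python
import numpy as np

# Cache was built from a left-padded prompt: first two slots are padding.
past_mask = np.array([[0, 0, 1, 1, 1]])

# Each new decoding step appends one real token to the mask...
new_token_mask = np.ones((1, 1), dtype=past_mask.dtype)

# ...but the concatenated mask must keep the cached pad slots masked,
# otherwise attention leaks into garbage key/value entries.
full_mask = np.concatenate([past_mask, new_token_mask], axis=-1)
print(full_mask)  # [[0 0 1 1 1 1]]
```

Forgetting this is a classic source of "first batch element generates fine, padded ones drift" bugs in batched generation.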
fix masking for old bloom
4d35600b
use constant image to avoid image loading errors
b2a5f411
style
3f58892a
test diffusers in series to avoid runner dying
b9d2e03d
fix
bdcc4252
cleanup and some comments
a3dc4e82
fix and test falcon alibi
a1ff2f2c
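Falcon's ALiBi variant biases attention scores with per-head slopes instead of position embeddings, so padding-aware positions matter here too. A sketch of the standard slope formula for a power-of-two head count (a geometric sequence starting at 2^(-8/num_heads)); the real implementation also interpolates for non-power-of-two counts, which this sketch omits:

```python
def alibi_slopes(num_heads: int) -> list:
    """Per-head ALiBi slopes for a power-of-two number of heads."""
    start = 2 ** (-8 / num_heads)  # e.g. 0.25 for 4 heads
    return [start ** (i + 1) for i in range(num_heads)]

print(alibi_slopes(4))  # [0.25, 0.0625, 0.015625, 0.00390625]
```

Each head's score for a key j is offset by `slope * distance(i, j)`; with left padding, that distance has to be computed from the attention mask rather than raw indices, which is what the fix above tests.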
style
603f62c2
fix, support and test multi_query=False as well
cf5b562c
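gpt_bigcode supports both multi-query attention (`multi_query=True`, one shared key/value head) and standard multi-head attention, and the kv-cache shapes differ between the two. A hedged numpy sketch of the multi-query case, where the single kv head is broadcast across query heads so a standard attention kernel can consume it (hypothetical shapes, not the PR's code):

```python
import numpy as np

def expand_kv(kv: np.ndarray, num_query_heads: int) -> np.ndarray:
    """Broadcast a shared key/value head to every query head.

    kv: (batch, 1, seq_len, head_dim)
    returns: (batch, num_query_heads, seq_len, head_dim), a zero-copy view.
    """
    batch, _, seq_len, head_dim = kv.shape
    return np.broadcast_to(kv, (batch, num_query_heads, seq_len, head_dim))

kv = np.zeros((2, 1, 7, 64))
print(expand_kv(kv, 8).shape)  # (2, 8, 7, 64)
```

With `multi_query=False` the cache already carries one kv head per query head, so the expansion step must be skipped; handling both branches is what the commit above fixes and tests.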
only apply masked testing for transformers version previous to 4.39
3a29549b
Update optimum/onnxruntime/modeling_decoder.py
af5fa34a
use text decoder position ids onnx config but test its sync with list
59c0c141
Merge branch 'fix-ort-batched-generation' of https://github.com/huggi…
b5d92e51
fix opt
9db07bf8
style
98123d49
echarlaix approved these changes on 2025-07-29
IlyasMoutawwakil changed the title from Fix ORTModelForCausalLM batched generation to Fix batched inference/generation, position_ids creation, falcon alibi, gpt_bigcode multi-query,.. 242 days ago
fix sdpa without overwriting torch onnx exporter
411df8f7
use inplace op ;-;
133f3409
Merge branch 'main' into fix-ort-batched-generation
90449484
fix st test
c98ab28a
patch directly in onnx because patch needs to happen after softmax
e787b921