Choosing input/total tokens automatically based on available VRAM? #2673
drbh commented on 2024-10-21
Choosing input/total tokens automatically based on available VRAM?
a1aac784
Update doc.
79469f5f
Narsil force pushed from b2272ab7 to 79469f5f 1 year ago
Remove generated files.
a31db047
Trying to fix non chunking targets.
0a01dde9
Attempt #2
5c3efbc7
fix.
82a6cb82
QuantLinear is rocm compatible.
849d8821
Much simpler logic after the overhead.
10534511
Updating logic + non flash.
6994fa12
Revert doc text.
cacaba64
Simple updates.
199973cc
Fix integration mt0 (transformers update).
e3db5259
drbh dismissed these changes on 2024-10-25
Merge branch 'main' into auto_length
c3fb2ecd
Narsil dismissed their stale review via c3fb2ecd 1 year ago
Narsil merged 0c9b6cdd into main 1 year ago
Narsil
deleted the auto_length branch 1 year ago