MI300 compatibility #1764
at last working!
3016e159
tunableop in warmup
b503b3de
wip fa2 triton & fix cudagraph bug
47e522a6
WIP debug Triton FA2
0ca83be8
working
f723e5cc
_custom_C.LLMM1 and HIP_FORCE_DEV_KERNARG=1
1b4c8b4b
cleaning
ec5343ec
add missing files
8eacae01
revert dev only changes
6d59eb2e
fix
562cd4b0
disable _custom_C.LLMM1 as it is broken for TP>=2
81c27ba9
reenable _custom_C.LLMM1 as the culprit was FA2 triton
325f9774
fix fa2 triton kernel not working with MQA/GQA
aef931ea
add LLMM_Silu
fbc5a6a1
black
e7289700
Merge branch 'main' into mi300-compat
75023670
run integration tests on rocm
b8da9024
use released torch 2.3
193dbb68
working & cached tunableop
17f5c307
trying to update to ROCm 6.1
a5093606
wip fix tunableop
2677bf85
Merge branch 'main' into mi300-compat
8ec3b1a7
working tunable
ff5e16b0
Merge branch 'mi300-temp' into mi300-compat
d2b4b02c
add model id
51b0c25f
tunableop on 1,...,8
1f37d572
remove unnecessary code
52f593bb
cleanup dockerfile
c7074265
more cleaning
6c385626
fxmarty marked this pull request as ready for review 1 year ago
ability to specify tunableop tuned lengths
caf07dec
add LLMM_Silu mistral
ca5ea451
allow ROCM_USE_FLASH_ATTN_V2_TRITON=1
64e65ba3
add debug dockerfile
cd313364
disable _custom_C for debug purpose
f4dac978
add rocm 6.0.2 dockerfile
b0c1fa65
patch again amd_hip_bf16 since we downgraded to rocm6.0
f2fecdce
fix
61b49859
clean dockerfiles
c5015ad6
Merge branch 'main' into mi300-compat (WIP)
f32fdd0f
update layers files
3b011ed3
fix merge issues
c683597b
fix various merge errors
b7e98ba6
apply suggestions
f8d37c14
Merge branch 'main' into mi300-compat
f2191247
documentation
afc74733
typo
0812e3bd
Narsil commented on 2024-05-16
black
265c76d3
update version
c9455730
fixes on review
df0a4536
refactor model_id, make tunableop default
a040a590
reflect in doc that tunableop is default
c8475594
remove unnecessary imports
7c6b9a09
diff nicer
8d7f18f4
nicer diff x2
3ded96fb
cleanup fastlinear
956ac30a
precise amd doc
2a7ba6ee
cleanup dockerfile
eea32267
tentatively fix build workflow
f5007ebc
Narsil merged 232e8d52 into main 1 year ago
Narsil deleted the mi300-compat branch 1 year ago