MI300 compatibility #1764

Narsil merged 58 commits into main from mi300-compat
fxmarty
fxmarty at last working!
3016e159
fxmarty commented on 2024-04-18
fxmarty tunableop in warmup
b503b3de
fxmarty wip fa2 triton & fix cudagraph bug
47e522a6
fxmarty WIP debug Triton FA2
0ca83be8
fxmarty working
f723e5cc
fxmarty _custom_C.LLMM1 and HIP_FORCE_DEV_KERNARG=1
1b4c8b4b
fxmarty commented on 2024-04-19
fxmarty cleaning
ec5343ec
fxmarty add missing files
8eacae01
fxmarty revert dev only changes
6d59eb2e
fxmarty fix
562cd4b0
fxmarty disable _custom_C.LLMM1 as it is broken for TP>=2
81c27ba9
fxmarty commented on 2024-04-19
fxmarty commented on 2024-04-19
fxmarty reenable _custom_C.LLMM1 as the culprit was FA2 triton
325f9774
fxmarty fix fa2 triton kernel not working with MQA/GQA
aef931ea
mht-sharma add LLMM_Silu
fbc5a6a1
mht-sharma black
e7289700
fxmarty Merge branch 'main' into mi300-compat
75023670
fxmarty run integration tests on rocm
b8da9024
fxmarty use released torch 2.3
193dbb68
fxmarty working & cached tunableop
17f5c307
fxmarty trying to update to ROCm 6.1
a5093606
fxmarty wip fix tunableop
2677bf85
fxmarty Merge branch 'main' into mi300-compat
8ec3b1a7
fxmarty working tunable
ff5e16b0
fxmarty Merge branch 'mi300-temp' into mi300-compat
d2b4b02c
fxmarty add model id
51b0c25f
fxmarty tunableop on 1,...,8
1f37d572
fxmarty remove unnecessary code
52f593bb
fxmarty cleanup dockerfile
c7074265
fxmarty more cleaning
6c385626
fxmarty marked this pull request as ready for review 1 year ago
fxmarty requested a review from Narsil 1 year ago
fxmarty requested a review from OlivierDehaene 1 year ago
fxmarty requested a review from drbh 1 year ago
fxmarty
fxmarty ability to specify tunableop tuned lengths
caf07dec
seungrokj
mht-sharma add LLMM_Silu mistral
ca5ea451
fxmarty allow ROCM_USE_FLASH_ATTN_V2_TRITON=1
64e65ba3
fxmarty add debug dockerfile
cd313364
fxmarty disable _custom_C for debug purpose
f4dac978
fxmarty add rocm 6.0.2 dockerfile
b0c1fa65
fxmarty patch again amd_hip_bf16 since we downgraded to rocm6.0
f2fecdce
fxmarty fix
61b49859
fxmarty clean dockerfils
c5015ad6
fxmarty Merge branch 'main' into mi300-compat (WIP)
f32fdd0f
fxmarty update layers files
3b011ed3
fxmarty fix merge issues
c683597b
fxmarty fix various merge errors
b7e98ba6
seungrokj
fxmarty apply suggestions
f8d37c14
fxmarty Merge branch 'main' into mi300-compat
f2191247
fxmarty documentation
afc74733
fxmarty
fxmarty typo
0812e3bd
HuggingFaceDocBuilderDev
Narsil
fxmarty commented on 2024-05-16
Narsil commented on 2024-05-16
fxmarty black
265c76d3
fxmarty update version
c9455730
fxmarty fixes on review
df0a4536
fxmarty refactor model_id, make tunableop default
a040a590
fxmarty reflect in doc that tunableop is default
c8475594
fxmarty remove unnecessary imports
7c6b9a09
fxmarty diff nicer
8d7f18f4
fxmarty nicer diff x2
3ded96fb
fxmarty cleanup fastlinear
956ac30a
fxmarty precise amd doc
2a7ba6ee
fxmarty cleanup dockerfile
eea32267
fxmarty
fxmarty tentatively fix build workflow
f5007ebc
Narsil merged 232e8d52 into main 1 year ago
Narsil deleted the mi300-compat branch 1 year ago
