Commits
  • at last working!
    fxmarty committed 1 year ago
  • tunableop in warmup
    fxmarty committed 1 year ago
  • wip fa2 triton & fix cudagraph bug
    fxmarty committed 1 year ago
  • WIP debug Triton FA2
    fxmarty committed 1 year ago
  • working
    fxmarty committed 1 year ago
  • _custom_C.LLMM1 and HIP_FORCE_DEV_KERNARG=1
    fxmarty committed 1 year ago
  • cleaning
    fxmarty committed 1 year ago
  • add missing files
    fxmarty committed 1 year ago
  • revert dev only changes
    fxmarty committed 1 year ago
  • fix
    fxmarty committed 1 year ago
  • disable _custom_C.LLMM1 as it is broken for TP>=2
    fxmarty committed 1 year ago
  • reenable _custom_C.LLMM1 as the culprit was FA2 triton
    fxmarty committed 1 year ago
  • fix fa2 triton kernel not working with MQA/GQA
    fxmarty committed 1 year ago
  • add LLMM_Silu
    mht-sharma committed 1 year ago
  • black
    mht-sharma committed 1 year ago
  • Merge branch 'main' into mi300-compat
    fxmarty committed 1 year ago
  • run integration tests on rocm
    fxmarty committed 1 year ago
  • use released torch 2.3
    fxmarty committed 1 year ago
  • working & cached tunableop
    fxmarty committed 1 year ago
  • trying to update to ROCm 6.1
    fxmarty committed 1 year ago
  • wip fix tunableop
    fxmarty committed 1 year ago
  • Merge branch 'main' into mi300-compat
    fxmarty committed 1 year ago
  • working tunable
    fxmarty committed 1 year ago
  • Merge branch 'mi300-temp' into mi300-compat
    fxmarty committed 1 year ago
  • add model id
    fxmarty committed 1 year ago
  • tunableop on 1,...,8
    fxmarty committed 1 year ago
  • remove unnecessary code
    fxmarty committed 1 year ago
  • cleanup dockerfile
    fxmarty committed 1 year ago
  • more cleaning
    fxmarty committed 1 year ago
  • ability to specify tunableop tuned lengths
    fxmarty committed 1 year ago
  • add LLMM_Silu mistral
    mht-sharma committed 1 year ago
  • allow ROCM_USE_FLASH_ATTN_V2_TRITON=1
    fxmarty committed 1 year ago
  • add debug dockerfile
    fxmarty committed 1 year ago
  • disable _custom_C for debug purpose
    fxmarty committed 1 year ago
  • add rocm 6.0.2 dockerfile
    fxmarty committed 1 year ago
  • patch again amd_hip_bf16 since we downgraded to rocm6.0
    fxmarty committed 1 year ago
  • fix
    fxmarty committed 1 year ago
  • clean dockerfils
    fxmarty committed 1 year ago
  • Merge branch 'main' into mi300-compat (WIP)
    fxmarty committed 1 year ago
  • update layers files
    fxmarty committed 1 year ago
  • fix merge issues
    fxmarty committed 1 year ago
  • fix various merge errors
    fxmarty committed 1 year ago
  • apply suggestions
    fxmarty committed 1 year ago
  • Merge branch 'main' into mi300-compat
    fxmarty committed 1 year ago
  • documentation
    fxmarty committed 1 year ago
  • typo
    fxmarty committed 1 year ago
  • black
    fxmarty committed 1 year ago
  • update version
    fxmarty committed 1 year ago
  • fixes on review
    fxmarty committed 1 year ago
  • refactor model_id, make tunableop default
    fxmarty committed 1 year ago
  • reflect in doc that tunableop is default
    fxmarty committed 1 year ago
  • remove unnecessary imports
    fxmarty committed 1 year ago
  • diff nicer
    fxmarty committed 1 year ago
  • nicer diff x2
    fxmarty committed 1 year ago
  • cleanup fastlinear
    fxmarty committed 1 year ago
  • precise amd doc
    fxmarty committed 1 year ago
  • cleanup dockerfile
    fxmarty committed 1 year ago
  • tentatively fix build workflow
    fxmarty committed 1 year ago