onnxruntime
Implement FlashAttention for CPU
#20805
Merged

Commits
  • Register new contrib op FlashAttention
    duanqn committed 1 year ago
  • Move getenv to constructor
    duanqn committed 1 year ago
  • Get Env
    duanqn committed 1 year ago
  • Renaming
    duanqn committed 1 year ago
  • Check for T==float
    duanqn committed 1 year ago
  • Lintrunner
    duanqn committed 1 year ago
  • Remove mlas function
    duanqn committed 1 year ago
  • Handle scale; Require present_k and present_v to be empty
    duanqn committed 1 year ago
  • Check is_unidirectional_
    duanqn committed 1 year ago
  • fix build
    duanqn committed 1 year ago
  • Merge with mlas.h
    duanqn committed 1 year ago
  • Add comment and MLASCALL
    duanqn committed 1 year ago
  • Remove unnecessary change
    duanqn committed 1 year ago
  • Fix onnxruntime_mlas.cmake
    duanqn committed 1 year ago
  • Pick onnxruntime/test/python/transformers/benchmark_mha.py from latest master
    duanqn committed 1 year ago
  • Disable FlashAttention by default
    duanqn committed 1 year ago
  • Fix value choice of row_size_q and row_size_kv; Add comments
    duanqn committed 1 year ago
  • Fix order
    duanqn committed 1 year ago
  • causal=False
    duanqn committed 1 year ago
  • Add MLASCALL on implementation
    duanqn committed 1 year ago
  • Improve comment
    duanqn committed 1 year ago
  • Enable FlashAttention by default
    duanqn committed 1 year ago
  • lintrunner -a
    duanqn committed 1 year ago
  • Remove memset
    duanqn committed 1 year ago
  • Fix l2_cache_size_
    duanqn committed 1 year ago
  • Fix PREfast
    duanqn committed 1 year ago
  • #include <algorithm>
    duanqn committed 1 year ago
  • Fix bug
    duanqn committed 1 year ago
  • lintrunner
    duanqn committed 1 year ago
  • Renaming
    duanqn committed 1 year ago
  • Renaming
    duanqn committed 1 year ago
  • Use MlasSgemmOperation
    duanqn committed 1 year ago
  • Move threading inside MLAS kernel
    duanqn committed 1 year ago
  • Remove MLASCALL
    duanqn committed 1 year ago
  • Remove 1 TODO
    duanqn committed 1 year ago
  • Renaming
    duanqn committed 1 year ago
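
For readers unfamiliar with the technique named in the PR title, here is a minimal sketch of the general FlashAttention idea: attention is computed over blocks of query rows and key/value rows, with a running ("online") softmax so that the full score matrix is never materialized. This is an illustration only, not the PR's MLAS kernel. The function names and the `block_q` / `block_kv` parameters are hypothetical; they only loosely correspond to the `row_size_q` / `row_size_kv` mentioned in the commit messages, and the sketch covers just the non-causal case.

```python
import math

def naive_attention(q, k, v, scale):
    """Reference: softmax(scale * q @ k^T) @ v, computed one query row at a time."""
    d_v = len(v[0])
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v)) / total
                    for c in range(d_v)])
    return out

def tiled_attention(q, k, v, scale, block_q=2, block_kv=2):
    """Blocked attention with an online softmax (non-causal, illustrative only)."""
    d_v = len(v[0])
    out = []
    for q_start in range(0, len(q), block_q):
        q_block = q[q_start:q_start + block_q]
        # Per-row running state: max score, softmax denominator, unnormalized output.
        run_max = [-math.inf] * len(q_block)
        run_sum = [0.0] * len(q_block)
        acc = [[0.0] * d_v for _ in q_block]
        for kv_start in range(0, len(k), block_kv):
            k_block = k[kv_start:kv_start + block_kv]
            v_block = v[kv_start:kv_start + block_kv]
            for r, qi in enumerate(q_block):
                scores = [scale * sum(a * b for a, b in zip(qi, kj))
                          for kj in k_block]
                new_max = max(run_max[r], max(scores))
                # Rescale everything accumulated under the previous maximum.
                correction = math.exp(run_max[r] - new_max)
                run_sum[r] *= correction
                acc[r] = [x * correction for x in acc[r]]
                for s, vj in zip(scores, v_block):
                    w = math.exp(s - new_max)
                    run_sum[r] += w
                    for c in range(d_v):
                        acc[r][c] += w * vj[c]
                run_max[r] = new_max
        # Normalize once per query row, after all key/value blocks are consumed.
        out.extend([x / run_sum[r] for x in acc[r]] for r in range(len(q_block)))
    return out
```

Calling `tiled_attention(q, k, v, scale)` on small nested lists of floats should agree with `naive_attention(q, k, v, scale)` up to floating-point rounding, whatever block sizes are chosen. A real CPU kernel would replace the inner Python loops with GEMM calls and pick block sizes to fit the cache; how the PR does this is not shown on this page.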