bitsandbytes
(reference-only) Multi backend refactor -> main (full diff of all already merged PRs)
#1220
Closed

Titus-von-Koeller wants to merge 283 commits into main from multi-backend-refactor
jianan-gu minor fix
59facc84
jianan-gu final refinement
066d0dc3
pnunna93 Enable col to row transformation
657ca4bf
pnunna93 Add make functions for row to col transformation
a390e0c4
pnunna93 Update get_transform_buffer for row to col in HIP
99ad6b57
pnunna93 Update igemmlt for col format
039b8086
pnunna93 Unskip test_igemmlt_int on ROCm
1a052ee3
pnunna93 Update igemmlt_int test for col inputs
b7ca5cf7
pnunna93 Skip transpose igemmlt test on ROCm
a2cd90d1
pnunna93 Revert "Update igemmlt_int test for col inputs"
5b6c5ac3
pnunna93 Return nvidia_transform from transform for HIP
218bf662
pnunna93 Fix syntax error
8bb5c2f7
pnunna93 Add comment for shape change
eb2edf7e
pnunna93 Enable nvidia_transform tests
a38ea0fd
pnunna93 Merge branch 'fix_igemmlt_int' of https://github.com/pnunna93/bitsand…
fbacd7ac
pnunna93 Enable igemmlt_half tests
67c383bc
pnunna93 Revert col32 check in nvidia_transform test
42b860f3
amathews-amd Merge pull request #3 from pnunna93/fix_igemmlt_int
7198d6bb
pnunna93 Merge remote-tracking branch 'upstream/main' into IFU-master-2024-01-24
b1d484aa
Lzy17 Update README.md
c36085d6
pnunna93 Update hip files with upstream changes
0e91e481
pnunna93 Skip failing tests for now
1295d53c
amathews-amd Merge pull request #4 from ROCm/IFU-master-2024-01-24
48b7fa9a
iiisak ops.hip: adapt to enum naming changes in ROCm/hipBLASLt@95131d6 and R…
f1a0b8b3
jianan-gu Merge remote-tracking branch 'main/main' into upstream_device_abstrac…
e34c30ec
jianan-gu refine backend register with base-backend
cebd83c1
jianan-gu Merge remote-tracking branch 'main/main' into upstream_device_abstrac…
e0f2e185
jianan-gu minor clean format
d20c0176
Lzy17 fix wmma api parity
a84c369a
Lzy17 hipify wmma datatype
b044010a
jianan-gu Merge remote-tracking branch 'main/main' into upstream_device_abstrac…
9f233081
jianan-gu format in CI
b41c1c4d
jianan-gu minor fix for format
1ab611e8
jianan-gu refactor base backend registering
b933f9f1
jianan-gu refine structures of backends
8b4baaa4
jianan-gu fix import issue
0905ad74
jianan-gu minor clean
145a8357
pnunna93 Enable estimate quantile tests
7aa42bee
jianan-gu fix CI python format
d270832c
pnunna93 Merge pull request #5 from iiisak/rocm_enabled
85377e16
amathews-amd Merge pull request #7 from ROCm/fix_estimate_quantiles
ffb0c5db
Titus-von-Koeller fix py38 vers incompatibility from other PR
68e78590
Titus-von-Koeller update pre-commit
012b565d
Titus-von-Koeller cuda.py: harmonize whitespace
8fa27f60
Titus-von-Koeller delete dead code
2c04d482
Titus-von-Koeller fix whitespace
c1846557
Titus-von-Koeller fix typo
03b53d7e
Titus-von-Koeller remove exstraneous import
ba7a1620
Titus-von-Koeller factor out ensure_backend_is_available, exc instead of assert
d162998e
Lzy17 Merge pull request #6 from ROCm/rocwmma_merge
2b77380c
pnunna93 Enable transpose flag for row to col transform
fad79188
pnunna93 Update descriptors for transpose flag
e3021ee0
pnunna93 revert nvidia_transform to transform
8c3476f2
update changes
5e1b152d
jianan-gu Remove minor device filter to avoid confusion
2cd9718c
pnunna93 Merge pull request #8 from ROCm/enable_transform_with_transpose
386e16c2
fixed minor mistakes
389bb7d0
pnunna93 Merge pull request #9 from ROCm/rocm_enabled_fix_bfloat16
b6770bff
pnunna93 remove blocksize 64 on rocm
fa288281
pnunna93 remove block size 64 and enable remaining tests
d86d24cb
pnunna93 Fix cuda build errors
cf4a5066
pnunna93 remove workspace in igemmlt
70771956
pnunna93 Enabled igemmlt in matmul
ec32fc1c
pnunna93 Fix shape issue in transform function
4536b251
pnunna93 Enable igemmlt int8 output
66e34c18
pnunna93 Add col format for extract outliers
7e5e2231
pnunna93 Enable dequant_mm
2e42adb8
pnunna93 Enable matmullt tests
e32d2770
pnunna93 Enabled linear_serialization tests
8206bd18
pnunna93 fix error with dequant_mm change
973a9f8c
pnunna93 Enable extract outliers test
387a9b79
pnunna93 Enable test overflow
93dfb51a
pnunna93 Skip overflow and linear serialization for now
90bbdc60
pnunna93 Merge pull request #10 from ROCm/remove_blocksize_64
9890d5d4
pnunna93 Merge pull request #11 from ROCm/fix_cuda_build_errs
1b6dd482
pnunna93 Merge pull request #12 from ROCm/igemm_workspace
fc9bf4d7
pnunna93 Merge pull request #13 from ROCm/enable_matmul
f30dc38d
improve the gemv 4bit accuracy by forcing the hipcub to 32
3dc14e85
Lzy17 Merge pull request #14 from ROCm/fix_gemv_4bit
f4ac9ac1
pnunna93 Update skip comment
485ba8f8
pnunna93 Merge pull request #15 from ROCm/gemv_skip_comment
a36bd1d2
jianan-gu Merge remote-tracking branch 'tim/multi-backend-refactor' into upstre…
f26a4e6e
jianan-gu clean up device setup
adfb5e20
jianan-gu clean
6f08879a
jianan-gu fix utils
a9e45488
jianan-gu link QuantState in F.
84f67d26
Titus-von-Koeller pre-commit run --all-files
9ff6c638
Titus-von-Koeller Merge pull request #898 from jianan-gu/upstream_device_abstraction
2ffa3674
pnunna93 Merge remote-tracking branch 'upstream/main' into IFU-master-2024-03-28
a551c160
update instructions
a2672217
amathews-amd Merge pull request #19 from ROCm/updated_readme
bcdcc0b4
pnunna93 Update README.md
ff333714
pnunna93 Merge branch 'rocm_enabled' into IFU-master-2024-03-28
1157e734
pnunna93 fix PEP errors
702ca1ae
pnunna93 Fix typos
8c23dc01
pnunna93 Merge branch 'IFU-master-2024-03-28' of https://github.com/ROCm/bitsa…
971f4b1d
pnunna93 Fix formatting in README file
4d6408a6
matthewdouglas (backends) Stub out additional backends; move more functions to backe…
d62516f2
Xia-Weiwen Add int8 ops for Intel CPU & XPU
13ad630c
Xia-Weiwen Remove XPU code; remove cpu example; add UT
77be40bd
Xia-Weiwen Fix igemmlt correctness issue
8d0b695d
Xia-Weiwen Bug fix for double_quant
67d86611
Xia-Weiwen Remove torch.compile for double_quant
92900f6c
pnunna93 Update gpu arch setting
79cb5548
pnunna93 Add ROCM_PATH variable
5c0414e2
pnunna93 Add HIP_VERSION variable
47795f55
pnunna93 Add BNB_HIP_VERSION variable
6d904524
pnunna93 Update supports igemmlt based on HIP version
049a2dc5
pnunna93 Skip failing tests based on HIP version
47a0bc3b
pnunna93 pre-commit fixes
1b2a0951
pnunna93 Update README file
4515a218
Xia-Weiwen refine pytest.skip message
717245d4
pnunna93 Update default arch list
e7ef75fc
pnunna93 update readme
c0d244c9
lcskrishna Merge pull request #17 from ROCm/IFU-master-2024-03-28
c037a306
pnunna93 Merge remote-tracking branch 'TD_BnB/multi-backend-refactor' into dev…
73f4f059
pnunna93 update igemmlt for hip
79652a58
pnunna93 Update mm_dequant for hip
aedfa8fa
pnunna93 Update transform function for hip
7835282a
Xia-Weiwen Fix lint issues
93e04b5c
Xia-Weiwen Fix backward
e1b60d30
adding arch detection for test_gemv_eye_4bit
60d7560a
implement get_rocm_gpu_arch
cae33c38
fixing lint
da53f39f
fixing lint
ae4dcec5
correct lint error
21d5ff60
pnunna93 Merge pull request #21 from ROCm/rocm_enabled_arch_detect
5bada9ba
Titus-von-Koeller merge changes from main
7f13c8ff
Xia-Weiwen Fix lint issue
95c29a63
Titus-von-Koeller Merge pull request #1173 from matthewdouglas/backend-stubs
749e06f0
pnunna93 Merge branch 'rocm_enabled' into device_abstraction
01abfdeb
lcskrishna update extract_outliers, quantize_4bit, dequantize_4bit
765bfc83
lcskrishna minor fixes for extract_outliers
d00c026a
lcskrishna update blocksizes for quantize and dequantize
e5574bdc
Xia-Weiwen Update bitsandbytes/backends/cpu_xpu_common.py
b0dec0a5
Xia-Weiwen Merge remote-tracking branch 'upstream/multi-backend-refactor' into m…
97e41b88
Xia-Weiwen Fix lint issue
295bb973
Merge branch 'rocm_enabled' of https://github.com/ROCm/bitsandbytes i…
a00bd1f2
lcskrishna update reg expression for detecting arch
7ab3a054
lcskrishna linter updates
9cd1d8c7
lcskrishna Merge branch 'device_abstraction' into cl/update-device-abs
62f8ed96
Xia-Weiwen Fix lint issue
37b05821
Titus-von-Koeller Merge pull request #1178 from Xia-Weiwen/multi-backend-refactor-cpu-x…
8561f09e
Xia-Weiwen Support NF4 on CPU backend
09cc153d
pnunna93 Merge pull request #23 from ROCm/cl/update-device-abs
d9e48034
pnunna93 Merge remote-tracking branch 'upstream/multi-backend-refactor' into d…
2af8568d
pnunna93 skip linear no igemmlt test
06f6b251
pnunna93 Remove archive functional file
2359452d
pnunna93 Sync README with upstream
f76d6abc
pnunna93 Remove bnb_accuracy file
576b62cd
pnunna93 Remove cuda_setup
dfb531b7
pnunna93 Remove test_delete_later.c
31b1cbc5
pnunna93 Sync with upstream
ed774769
pnunna93 Sync files with upstream
943c57a2
pnunna93 Fix lint errors
71d17023
pnunna93 Exclude hip files from typo checks
6886bc8f
pnunna93 update ops.hip
0d445f4f
lcskrishna Merge pull request #27 from ROCm/dev_abs_IFU
bc6d0b7a
Xia-Weiwen Minor improvements
177bd398
pnunna93 Add install steps for ROCm
15c7f779
pnunna93 Fix lint error
d62c8358
lcskrishna Merge pull request #28 from ROCm/dev_abs_add_install_steps
8aae7c95
Xia-Weiwen Add fp4 support; add UT; fix lint issues
881b5fcd
Xia-Weiwen Reduce memory usage
dd157347
Xia-Weiwen Fix UT
85a01b00
Xia-Weiwen reduce memory usage for nf4
2c489f8d
pnunna93 Add comments for HIP changes
410f4998
Titus-von-Koeller Merge pull request #1206 from Xia-Weiwen/multi-backend-refactor-cpu-4bit
701c5aae
Titus-von-Koeller Merge pull request #1207 from ROCm/device_abstraction
eb3b816e
Titus-von-Koeller assigned Titus-von-Koeller 1 year ago
ji-huazhong Add empty stubs for Ascend NPU
ccee5d89
Titus-von-Koeller Merge pull request #1223 from statelesshz/backend-npu
09c314ab
Titus-von-Koeller Merge branch 'main' into multi-backend-refactor
2dbf8766
jiqing-feng fix blocksize
36fe1a0c
Titus-von-Koeller Merge pull request #1228 from jiqing-feng/4bit
dba83768
Xia-Weiwen CPU: add torch.compile for F.double_quant and F.quantize_4bit (#1238)
517eaf2b
Titus-von-Koeller cleanup docs-build breaking install instructs (#1244)
193120d1
Titus-von-Koeller provide temp flag for outside libs to detect multi-backend preview (#…
c79b1e92
Xia-Weiwen CPU/XPU: disable torch.compile if g++ is not available (#1251)
1bfecc81
pnunna93 Create build job for ROCm (#1255)
08597844
Titus-von-Koeller Changelog: add explanation r. QLoRA mem savings
9b726798
Titus-von-Koeller force-pushed the main branch from 774d0656 to 9b726798 1 year ago
Titus-von-Koeller force-pushed from 985cbc21 to 63f5872b 1 year ago
Titus-von-Koeller merge `main` into `multi-backend-refactor`
056011a5
Titus-von-Koeller force-pushed from 63f5872b to 056011a5 1 year ago
Titus-von-Koeller docs: add more details to Intel install
81375f8e
Titus-von-Koeller force-pushed the main branch from 9b726798 to 78007346 1 year ago
Titus-von-Koeller docs: cleanup of compilation instructions
24f7b652
Titus-von-Koeller docs: CHANGELOG.md fix
e3b27805
Titus-von-Koeller Merge remote-tracking branch 'upstream/main' into multi-backend-refactor
0b53d317
jiqing-feng fix dtype mismatch (#1285)
c8b4b33e
Titus-von-Koeller allow features flags on bnb
d385aeaa
jiqing-feng Fix dequant 4bit (#1300)
452749a6
jiqing-feng fix loading int8 model in CPU (#1303)
a142f1eb
jiqing-feng fix transpose 4bit (#1301)
17750358
pnunna93 Enable bitsandbytes packaging for ROCm (#1299)
6d9b69b6
Titus-von-Koeller add bnb attribute to expose supported devices
bb438579
jiqing-feng fix 4bit dtype (#1325)
18668d29
Titus-von-Koeller docs: tweaks for multi-backend preview release prep
2bfa3472
Titus-von-Koeller docs: get started on detailed multi-backend guide
c8383fbf
jiqing-feng rm warn for multi backend (#1336)
3b94d626
Titus-von-Koeller actions: update permissions for pr docs publishing
39097a6f
jiqing-feng fix nf4 memory issue by init op_context in forward (#1349)
27846533
pnunna93 AMD: Clarify diagnostic messages; free up disk space for CI build
45b7d14a
jiqing-feng check grad before using ipex (#1358)
a23984fe
pnunna93 Enable packaging for ROCm 6.2 (#1367)
e8881bef
matthewdouglas Update for VS2022 17.11 compatibility with CUDA < 12.4 (#1341)
0d3d977c
matthewdouglas Enable continuous releases for multi-backend-refactor branch
e72637c9
matthewdouglas Update release workflow
662dc605
matthewdouglas Publish continuous release for multi-backend
3227cdd3
Titus-von-Koeller continuous release: revert wheel renaming due to install err
0a2b5392
Titus-von-Koeller Revert "continuous release: revert wheel renaming due to install err"
8c5499e7
Titus-von-Koeller add dynamic tag-based versioning + git hash for dev vers
02d5b423
Titus-von-Koeller docs: update w/ changes from `main`
6927dcc4
Titus-von-Koeller force-pushed from c09603c5 to f495c7e7 1 year ago
Titus-von-Koeller force-pushed from 0585a6a9 to fedd94e8 1 year ago
Titus-von-Koeller get tags for dynamic versioning
8dcd971c
Titus-von-Koeller force-pushed from 0a2ecadf to 8dcd971c 1 year ago
Titus-von-Koeller fine-tune continuous release params
09ac7ec3
Titus-von-Koeller reduce the pkg size + build times for the preview release
cc56a30e
Titus-von-Koeller refine docs for multi-backend alpha release (#1380)
5225ebea
Titus-von-Koeller docs: remove 2 obsolete lines
e6cc1093
pnunna93 Remove depth option in installation steps (#1395)
cd3cb681
ji-huazhong Fix issue that no valid semantic version tag found when installing bi…
cd73601f
jiqing-feng Enable XPU and optimize cpu/xpu op (#1418)
b2ac4232
jiqing-feng fix cpu nf4 (#1432)
93156921
ji-huazhong Add Ascend NPU support for nf4 quant (#1422)
99483337
jiqing-feng fix device check (#1453)
7e6f8657
jiqing-feng Enable double quant on Intel CPU and XPU (#1472)
f6025bca
jiqing-feng Enable dequant+matmul 8bit path for Intel CPU and XPU (#1484)
307fbd52
faaany add device index (#1489)
a0a95fd7
matthewdouglas Sync branch with main; resolve conflicts.
ca299367
matthewdouglas Update base backend docstrings
ed2a58d2
matthewdouglas Update NPU backend with new spec
07c23de3
matthewdouglas Update CPU tests
94d60277
matthewdouglas ROCm: Fix compilation.
3fabd1a9
matthewdouglas Fix
d3ead1eb
matthewdouglas Build: use setuptools_scm for dynamic versioning compatibility with p…
6c4d8789
matthewdouglas Update wheel build
2d06869e
matthewdouglas Add rocm6.3.2
7c917b0f
matthewdouglas setuptools_scm update
fdbbfb6f
jiqing-feng fix xpu woq linear dtype (#1506)
89373b8e
jiqing-feng fix version (#1532)
26407538
matthewdouglas added the Cross Platform label
jiqing-feng enable benchmark script (#1554)
c66e1370
jiqing-feng update comments (#1562)
83c147de
jiqing-feng enable quant storage (#1563)
0cd87aaf
jiqing-feng fix meta device dispatch (#1564)
2354bdd0
jiqing-feng Enable XPU int matmul (#1547)
249a3cd0
jiqing-feng Fix XPU 4bit (#1567)
8fe63259
jiqing-feng Fix xpu to cpu (#1570)
d3658c54
jiqing-feng fix double compress 8bit precision (#1582)
d180d8e8
jiqing-feng Remove error log for Intel CPU/XPU (#1503)
54a2ad57
Liangliang-Ma XPU backend support 8bit optimizer (#1565)
5c48b333
ckvermaAI HPU support for bitsandbytes (#1592)
b090d85a
jiqing-feng fix log (#1604)
5027e64a
jiqing-feng fix xpu ipex linear in torch2.7 (#1618)
263179a0
ckvermaAI update compute_type_is_set attr (#1623)
5e267f5f
Titus-von-Koeller changed the title from "(WIP) Multi backend refactor -> main (full diff of all already merged PRs)" to "(reference-only) Multi backend refactor -> main (full diff of all already merged PRs)" 223 days ago
rsshaik1 supports HPU double quant (#1630)
c3eac426
Titus-von-Koeller closed this 117 days ago
