bitsandbytes
(reference-only) Multi backend refactor -> main (full diff of all already merged PRs)
#1220
Closed
Titus-von-Koeller wants to merge 283 commits into main from multi-backend-refactor
minor fix
59facc84
final refinement
066d0dc3
Enable col to row transformation
657ca4bf
Add make functions for row to col transformation
a390e0c4
Update get_transform_buffer for row to col in HIP
99ad6b57
Update igemmlt for col format
039b8086
Unskip test_igemmlt_int on ROCm
1a052ee3
Update igemmlt_int test for col inputs
b7ca5cf7
Skip transpose igemmlt test on ROCm
a2cd90d1
Revert "Update igemmlt_int test for col inputs"
5b6c5ac3
Return nvidia_transform from transform for HIP
218bf662
Fix syntax error
8bb5c2f7
Add comment for shape change
eb2edf7e
Enable nvidia_transform tests
a38ea0fd
Merge branch 'fix_igemmlt_int' of https://github.com/pnunna93/bitsand…
fbacd7ac
Enable igemmlt_half tests
67c383bc
Revert col32 check in nvidia_transform test
42b860f3
Merge pull request #3 from pnunna93/fix_igemmlt_int
7198d6bb
Merge remote-tracking branch 'upstream/main' into IFU-master-2024-01-24
b1d484aa
Update README.md
c36085d6
Update hip files with upstream changes
0e91e481
Skip failing tests for now
1295d53c
Merge pull request #4 from ROCm/IFU-master-2024-01-24
48b7fa9a
ops.hip: adapt to enum naming changes in ROCm/hipBLASLt@95131d6 and R…
f1a0b8b3
Merge remote-tracking branch 'main/main' into upstream_device_abstrac…
e34c30ec
refine backend register with base-backend
cebd83c1
Merge remote-tracking branch 'main/main' into upstream_device_abstrac…
e0f2e185
minor clean format
d20c0176
fix wmma api parity
a84c369a
hipify wmma datatype
b044010a
Merge remote-tracking branch 'main/main' into upstream_device_abstrac…
9f233081
format in CI
b41c1c4d
minor fix for format
1ab611e8
refactor base backend registering
b933f9f1
refine structures of backends
8b4baaa4
fix import issue
0905ad74
minor clean
145a8357
Enable estimate quantile tests
7aa42bee
fix CI python format
d270832c
Merge pull request #5 from iiisak/rocm_enabled
85377e16
Merge pull request #7 from ROCm/fix_estimate_quantiles
ffb0c5db
fix py38 vers incompatibility from other PR
68e78590
update pre-commit
012b565d
cuda.py: harmonize whitespace
8fa27f60
delete dead code
2c04d482
fix whitespace
c1846557
fix typo
03b53d7e
remove exstraneous import
ba7a1620
factor out ensure_backend_is_available, exc instead of assert
d162998e
Merge pull request #6 from ROCm/rocwmma_merge
2b77380c
Enable transpose flag for row to col transform
fad79188
Update descriptors for transpose flag
e3021ee0
revert nvidia_transform to transform
8c3476f2
update changes
5e1b152d
Remove minor device filter to avoid confusion
2cd9718c
Merge pull request #8 from ROCm/enable_transform_with_transpose
386e16c2
fixed minor mistakes
389bb7d0
Merge pull request #9 from ROCm/rocm_enabled_fix_bfloat16
b6770bff
remove blocksize 64 on rocm
fa288281
remove block size 64 and enable remaining tests
d86d24cb
Fix cuda build errors
cf4a5066
remove workspace in igemmlt
70771956
Enabled igemmlt in matmul
ec32fc1c
Fix shape issue in transform function
4536b251
Enable igemmlt int8 output
66e34c18
Add col format for extract outliers
7e5e2231
Enable dequant_mm
2e42adb8
Enable matmullt tests
e32d2770
Enabled linear_serialization tests
8206bd18
fix error with dequant_mm change
973a9f8c
Enable extract outliers test
387a9b79
Enable test overflow
93dfb51a
Skip overflow and linear serialization for now
90bbdc60
Merge pull request #10 from ROCm/remove_blocksize_64
9890d5d4
Merge pull request #11 from ROCm/fix_cuda_build_errs
1b6dd482
Merge pull request #12 from ROCm/igemm_workspace
fc9bf4d7
Merge pull request #13 from ROCm/enable_matmul
f30dc38d
improve the gemv 4bit accuracy by forcing the hipcub to 32
3dc14e85
Merge pull request #14 from ROCm/fix_gemv_4bit
f4ac9ac1
Update skip comment
485ba8f8
Merge pull request #15 from ROCm/gemv_skip_comment
a36bd1d2
Merge remote-tracking branch 'tim/multi-backend-refactor' into upstre…
f26a4e6e
clean up device setup
adfb5e20
clean
6f08879a
fix utils
a9e45488
link QuantState in F.
84f67d26
pre-commit run --all-files
9ff6c638
Merge pull request #898 from jianan-gu/upstream_device_abstraction
2ffa3674
Merge remote-tracking branch 'upstream/main' into IFU-master-2024-03-28
a551c160
update instructions
a2672217
Merge pull request #19 from ROCm/updated_readme
bcdcc0b4
Update README.md
ff333714
Merge branch 'rocm_enabled' into IFU-master-2024-03-28
1157e734
fix PEP errors
702ca1ae
Fix typos
8c23dc01
Merge branch 'IFU-master-2024-03-28' of https://github.com/ROCm/bitsa…
971f4b1d
Fix formatting in README file
4d6408a6
(backends) Stub out additional backends; move more functions to backe…
d62516f2
Add int8 ops for Intel CPU & XPU
13ad630c
Remove XPU code; remove cpu example; add UT
77be40bd
Fix igemmlt correctness issue
8d0b695d
Bug fix for double_quant
67d86611
Remove torch.compile for double_quant
92900f6c
Update gpu arch setting
79cb5548
Add ROCM_PATH variable
5c0414e2
Add HIP_VERSION variable
47795f55
Add BNB_HIP_VERSION variable
6d904524
Update supports igemmlt based on HIP version
049a2dc5
Skip failing tests based on HIP version
47a0bc3b
pre-commit fixes
1b2a0951
Update README file
4515a218
refine pytest.skip message
717245d4
Update default arch list
e7ef75fc
update readme
c0d244c9
Merge pull request #17 from ROCm/IFU-master-2024-03-28
c037a306
Merge remote-tracking branch 'TD_BnB/multi-backend-refactor' into dev…
73f4f059
update igemmlt for hip
79652a58
Update mm_dequant for hip
aedfa8fa
Update transform function for hip
7835282a
Fix lint issues
93e04b5c
Fix backward
e1b60d30
adding arch detection for test_gemv_eye_4bit
60d7560a
implement get_rocm_gpu_arch
cae33c38
fixing lint
da53f39f
fixing lint
ae4dcec5
correct lint error
21d5ff60
Merge pull request #21 from ROCm/rocm_enabled_arch_detect
5bada9ba
merge changes from main
7f13c8ff
Fix lint issue
95c29a63
Merge pull request #1173 from matthewdouglas/backend-stubs
749e06f0
Merge branch 'rocm_enabled' into device_abstraction
01abfdeb
update extract_outliers, quantize_4bit, dequantize_4bit
765bfc83
minor fixes for extract_outliers
d00c026a
update blocksizes for quantize and dequantize
e5574bdc
Update bitsandbytes/backends/cpu_xpu_common.py
b0dec0a5
Merge remote-tracking branch 'upstream/multi-backend-refactor' into m…
97e41b88
Fix lint issue
295bb973
Merge branch 'rocm_enabled' of https://github.com/ROCm/bitsandbytes i…
a00bd1f2
update reg expression for detecting arch
7ab3a054
linter updates
9cd1d8c7
Merge branch 'device_abstraction' into cl/update-device-abs
62f8ed96
Fix lint issue
37b05821
Merge pull request #1178 from Xia-Weiwen/multi-backend-refactor-cpu-x…
8561f09e
Support NF4 on CPU backend
09cc153d
Merge pull request #23 from ROCm/cl/update-device-abs
d9e48034
Merge remote-tracking branch 'upstream/multi-backend-refactor' into d…
2af8568d
skip linear no igemmlt test
06f6b251
Remove archive functional file
2359452d
Sync README with upstream
f76d6abc
Remove bnb_accuracy file
576b62cd
Remove cuda_setup
dfb531b7
Remove test_delete_later.c
31b1cbc5
Sync with upstream
ed774769
Sync files with upstream
943c57a2
Fix lint errors
71d17023
Exclude hip files from typo checks
6886bc8f
update ops.hip
0d445f4f
Merge pull request #27 from ROCm/dev_abs_IFU
bc6d0b7a
Minor improvements
177bd398
Add install steps for ROCm
15c7f779
Fix lint error
d62c8358
Merge pull request #28 from ROCm/dev_abs_add_install_steps
8aae7c95
Add fp4 support; add UT; fix lint issues
881b5fcd
Reduce memory usage
dd157347
Fix UT
85a01b00
reduce memory usage for nf4
2c489f8d
Add comments for HIP changes
410f4998
Merge pull request #1206 from Xia-Weiwen/multi-backend-refactor-cpu-4bit
701c5aae
Merge pull request #1207 from ROCm/device_abstraction
eb3b816e
Titus-von-Koeller assigned Titus-von-Koeller 1 year ago
Add empty stubs for Ascend NPU
ccee5d89
Merge pull request #1223 from statelesshz/backend-npu
09c314ab
Merge branch 'main' into multi-backend-refactor
2dbf8766
fix blocksize
36fe1a0c
Merge pull request #1228 from jiqing-feng/4bit
dba83768
CPU: add torch.compile for F.double_quant and F.quantize_4bit (#1238)
517eaf2b
cleanup docs-build breaking install instructs (#1244)
193120d1
provide temp flag for outside libs to detect multi-backend preview (#…
c79b1e92
CPU/XPU: disable torch.compile if g++ is not available (#1251)
1bfecc81
Create build job for ROCm (#1255)
08597844
Changelog: add explanation r. QLoRA mem savings
9b726798
Titus-von-Koeller force-pushed the main branch from 774d0656 to 9b726798 1 year ago
Titus-von-Koeller force-pushed from 985cbc21 to 63f5872b 1 year ago
merge `main` into `multi-backend-refactor`
056011a5
Titus-von-Koeller force-pushed from 63f5872b to 056011a5 1 year ago
docs: add more details to Intel install
81375f8e
Titus-von-Koeller force-pushed the main branch from 9b726798 to 78007346 1 year ago
docs: cleanup of compilation instructions
24f7b652
docs: CHANGELOG.md fix
e3b27805
Merge remote-tracking branch 'upstream/main' into multi-backend-refactor
0b53d317
fix dtype mismatch (#1285)
c8b4b33e
allow features flags on bnb
d385aeaa
Fix dequant 4bit (#1300)
452749a6
fix loading int8 model in CPU (#1303)
a142f1eb
fix transpose 4bit (#1301)
17750358
Enable bitsandbytes packaging for ROCm (#1299)
6d9b69b6
add bnb attribute to expose supported devices
bb438579
fix 4bit dtype (#1325)
18668d29
docs: tweaks for multi-backend preview release prep
2bfa3472
docs: get started on detailed multi-backend guide
c8383fbf
rm warn for multi backend (#1336)
3b94d626
actions: update permissions for pr docs publishing
39097a6f
fix nf4 memory issue by init op_context in forward (#1349)
27846533
AMD: Clarify diagnostic messages; free up disk space for CI build
45b7d14a
check grad before using ipex (#1358)
a23984fe
Enable packaging for ROCm 6.2 (#1367)
e8881bef
Update for VS2022 17.11 compatibility with CUDA < 12.4 (#1341)
0d3d977c
Enable continuous releases for multi-backend-refactor branch
e72637c9
Update release workflow
662dc605
Publish continuous release for multi-backend
3227cdd3
continuous release: revert wheel renaming due to install err
0a2b5392
Revert "continuous release: revert wheel renaming due to install err"
8c5499e7
add dynamic tag-based versioning + git hash for dev vers
02d5b423
docs: update w/ changes from `main`
6927dcc4
Titus-von-Koeller force-pushed from c09603c5 to f495c7e7 1 year ago
Titus-von-Koeller force-pushed from 0585a6a9 to fedd94e8 1 year ago
get tags for dynamic versioning
8dcd971c
Titus-von-Koeller force-pushed from 0a2ecadf to 8dcd971c 1 year ago
fine-tune continuous release params
09ac7ec3
reduce the pkg size + build times for the preview release
cc56a30e
refine docs for multi-backend alpha release (#1380)
5225ebea
docs: remove 2 obsolete lines
e6cc1093
Remove depth option in installation steps (#1395)
cd3cb681
Fix issue that no valid semantic version tag found when installing bi…
cd73601f
Enable XPU and optimize cpu/xpu op (#1418)
b2ac4232
fix cpu nf4 (#1432)
93156921
Add Ascend NPU support for nf4 quant (#1422)
99483337
fix device check (#1453)
7e6f8657
Enable double quant on Intel CPU and XPU (#1472)
f6025bca
Enable dequant+matmul 8bit path for Intel CPU and XPU (#1484)
307fbd52
add device index (#1489)
a0a95fd7
Sync branch with main; resolve conflicts.
ca299367
Update base backend docstrings
ed2a58d2
Update NPU backend with new spec
07c23de3
Update CPU tests
94d60277
ROCm: Fix compilation.
3fabd1a9
Fix
d3ead1eb
Build: use setuptools_scm for dynamic versioning compatibility with p…
6c4d8789
Update wheel build
2d06869e
Add rocm6.3.2
7c917b0f
setuptools_scm update
fdbbfb6f
fix xpu woq linear dtype (#1506)
89373b8e
fix version (#1532)
26407538
matthewdouglas added the Cross Platform label
enable benchmark script (#1554)
c66e1370
update comments (#1562)
83c147de
enable quant storage (#1563)
0cd87aaf
fix meta device dispatch (#1564)
2354bdd0
Enable XPU int matmul (#1547)
249a3cd0
Fix XPU 4bit (#1567)
8fe63259
Fix xpu to cpu (#1570)
d3658c54
fix double compress 8bit precision (#1582)
d180d8e8
Remove error log for Intel CPU/XPU (#1503)
54a2ad57
XPU backend support 8bit optimizer (#1565)
5c48b333
HPU support for bitsandbytes (#1592)
b090d85a
fix log (#1604)
5027e64a
fix xpu ipex linear in torch2.7 (#1618)
263179a0
update compute_type_is_set attr (#1623)
5e267f5f
Titus-von-Koeller changed the title from "(WIP) Multi backend refactor -> main (full diff of all already merged PRs)" to "(reference-only) Multi backend refactor -> main (full diff of all already merged PRs)" 223 days ago
supports HPU double quant (#1630)
c3eac426
Titus-von-Koeller closed this 117 days ago
Reviewers: No reviews
Assignees: Titus-von-Koeller
Labels: Cross Platform
Milestone: No milestone