llama.cpp
Fix conversion of unnormalized BF16->BF16 weights
#7843
Merged
compilade merged 10 commits into ggml-org:master from CISC:convert-bf16-fix
add truncate_bf16 (6a52bfe3)
truncate intermediate fp32 if converting bf16 to bf16 (46054d1a)
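Taken together, these two commits carry the core fix: when both the source and target weights are bf16, the intermediate fp32 tensor already holds bit-exact bf16 values, so truncating (rather than rounding) recovers the original bits. A minimal sketch of such a helper, assuming a numpy-based conversion path like the gguf-py scripts use:

```python
import numpy as np

def truncate_bf16(x: np.ndarray) -> np.ndarray:
    # The fp32 intermediate came from bf16 weights, so its low 16 bits
    # are already zero; keeping only the high half returns the original
    # bf16 bit patterns unchanged: no rounding, no NaN rewriting, and
    # subnormal ("unnormalized") values pass through intact.
    n = x.astype(np.float32, copy=False).view(np.uint32)
    return (n >> np.uint32(16)).astype(np.uint16)
```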
github-actions added the python label
mofosyne added the Review Complexity : Low label
fix masking in __compute_fp32_to_bf16 (069369f3)
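For weights that are not already bf16, conversion still has to round. A sketch of the standard fp32 to bf16 round-to-nearest-even with NaN quieting, the kind of bit manipulation __compute_fp32_to_bf16 performs (the constants here illustrate the technique and are not copied from the patch):

```python
import numpy as np

def fp32_to_bf16_rne(x: np.ndarray) -> np.ndarray:
    n = x.astype(np.float32, copy=False).view(np.uint32)
    # Quiet any NaN first: keep the high half and set the quiet bit
    # (bit 22), so mantissa truncation cannot turn a NaN into +/-inf.
    is_nan = (n & np.uint32(0x7FFFFFFF)) > np.uint32(0x7F800000)
    n = np.where(is_nan, (n & np.uint32(0xFFFF0000)) | np.uint32(0x00400000), n)
    # Round to nearest even: add a bias of 0x7FFF plus the lsb of the
    # kept half, widening to uint64 so the carry cannot wrap around.
    lsb = (n >> np.uint32(16)) & np.uint32(1)
    n = (n.astype(np.uint64) + np.uint64(0x7FFF) + lsb) >> np.uint64(16)
    return n.astype(np.uint16)
```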
np.int16 no longer used (225ec48f)
missing cast and additional numpy 2.x fix (e8e2b7e0)
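The numpy 2.x breakage these commits work around is plausibly NEP 50 promotion: Python integer operands are now interpreted in the array's dtype instead of being promoted, so a mask literal that does not fit the dtype raises instead of silently widening. A hedged illustration (the exact failing expression in the patch may differ):

```python
import numpy as np

n = np.zeros(4, dtype=np.uint16)
# NumPy 1.x promoted this to int64; NumPy 2.x raises OverflowError
# because 0xFFFF0000 does not fit in uint16:
#   n & 0xFFFF0000
# An explicit cast keeps the operation well-defined on both versions:
hi = n.astype(np.uint32) & np.uint32(0xFFFF0000)
```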
ggml-impl : do not flush bf16 subnormals to zero (5b67a6cf)
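This ggml-side commit ties into the "unnormalized" weights of the PR title: flushing subnormal bf16 values to zero during conversion silently zeroes tiny weights. A quick numpy check of the intended round-trip behavior, reusing the truncate_bf16 sketch above (bf16 subnormals occupy fp32 bit patterns 0x00010000 through 0x007F0000, plus their negatives):

```python
import numpy as np

tiny = np.array([0x00010000, 0x007F0000], dtype=np.uint32).view(np.float32)
# With flush-to-zero removed, these survive conversion instead of
# collapsing to 0x0000:
assert list(truncate_bf16(tiny)) == [0x0001, 0x007F]
```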
Merge branch 'master' into convert-bf16-fix (675a7410)
github-actions added the ggml label
Merge branch 'master' of github.com:ggerganov/llama.cpp into convert-… (2b746488)
missed prototype update in merge (dc051541)
merge cleanup (3a3a7528)
compilade approved these changes on 2024-08-01
mofosyne commented on 2024-08-01
mofosyne added the merge ready label
compilade merged b72c20b8 into master 1 year ago
CISC deleted the convert-bf16-fix branch 1 year ago
Reviewers: compilade, mofosyne
Assignees: No one assigned
Labels: Review Complexity : Low, python, ggml, merge ready
Milestone: No milestone