llama.cpp
metal : fix build errors and rope kernel sig after #2268
#3898
Merged

ggerganov merged 1 commit into master from fix-metal-after-yarn
ggerganov, 1 year ago (edited 1 year ago) 👍 1

Not sure how this even compiled for other people. On M2 Ultra there were quite a few errors in the MSL code:

ggml_metal_init: allocating
ggml_metal_init: found device: Apple M2 Ultra
ggml_metal_init: picking default device: Apple M2 Ultra
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: loading '/Users/ggerganov/development/github/llama.cpp/ggml-metal.metal'
ggml_metal_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "program_source:1073:11: error: pointer type must have explicit address space qualifier
    float * cos_theta, float * sin_theta
          ^
program_source:1073:30: error: pointer type must have explicit address space qualifier
    float * cos_theta, float * sin_theta
                             ^
program_source:1079:9: error: use of undeclared identifier 'ramp_mix'
        ramp_mix = rope_yarn_ramp(corr_dims[0], corr_dims[1], i0) * ext_factor;
        ^
program_source:1080:37: error: use of undeclared identifier 'ramp_mix'
        theta = theta_interp * (1 - ramp_mix) + theta_extrap * ramp_mix;
                                    ^
program_source:1080:64: error: use of undeclared identifier 'ramp_mix'
        theta = theta_interp * (1 - ramp_mix) + theta_extrap * ramp_mix;
                                                               ^
program_source:1083:33: error: use of undeclared identifier 'logf'
        mscale *= 1.0f + 0.1f * logf(1.0f / freq_scale);
                                ^
program_source:1085:18: error: use of undeclared identifier 'cosf'
    *cos_theta = cosf(theta) * mscale;
                 ^
program_source:1086:18: error: use of undeclared identifier 'sinf'
    *sin_theta = sinf(theta) * mscale;
                 ^
program_source:1172:33: error: use of undeclared identifier 'n_orig_ctx'
    rope_yarn_corr_dims(n_dims, n_orig_ctx, freq_base, beta_fast, beta_slow, corr_dims);
                                ^
program_source:1223:57: error: explicit instantiation of 'kernel_rope' does not refer to a function template, variable template, member function, member class, or static data member
template [[host_name("kernel_rope_f32")]] kernel rope_t kernel_rope<float>;
                                                        ^
program_source:1133:13: note: candidate template ignored: could not match 'void (const device void *, const device int32_t *, device float *, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, uint, uint3, uint3)' (aka 'void (const device void *, const device int *, device float *, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, unsigned int, uint3, uint3)') against 'void (const device void *, const device int32_t *, device float *, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const 
constant uint64_t &, const constant uint64_t &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, uint, uint3, uint3)' (aka 'void (const device void *, const device int *, device float *, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, unsigned int, uint3, uint3)')
kernel void kernel_rope(
            ^
program_source:1224:57: error: explicit instantiation of 'kernel_rope' does not refer to a function template, variable template, member function, member class, or static data member
template [[host_name("kernel_rope_f16")]] kernel rope_t kernel_rope<half>;
                                                        ^
program_source:1133:13: note: candidate template ignored: could not match 'void (const device void *, const device int32_t *, device float *, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, uint, uint3, uint3)' (aka 'void (const device void *, const device int *, device float *, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, unsigned int, uint3, uint3)') against 'void (const device void *, const device int32_t *, device float *, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const 
constant uint64_t &, const constant uint64_t &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, uint, uint3, uint3)' (aka 'void (const device void *, const device int *, device float *, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, unsigned int, uint3, uint3)')
kernel void kernel_rope(
            ^
" UserInfo={NSLocalizedDescription=program_source:1073:11: error: pointer type must have explicit address space qualifier
    float * cos_theta, float * sin_theta
          ^
program_source:1073:30: error: pointer type must have explicit address space qualifier
    float * cos_theta, float * sin_theta
                             ^
program_source:1079:9: error: use of undeclared identifier 'ramp_mix'
        ramp_mix = rope_yarn_ramp(corr_dims[0], corr_dims[1], i0) * ext_factor;
        ^
program_source:1080:37: error: use of undeclared identifier 'ramp_mix'
        theta = theta_interp * (1 - ramp_mix) + theta_extrap * ramp_mix;
                                    ^
program_source:1080:64: error: use of undeclared identifier 'ramp_mix'
        theta = theta_interp * (1 - ramp_mix) + theta_extrap * ramp_mix;
                                                               ^
program_source:1083:33: error: use of undeclared identifier 'logf'
        mscale *= 1.0f + 0.1f * logf(1.0f / freq_scale);
                                ^
program_source:1085:18: error: use of undeclared identifier 'cosf'
    *cos_theta = cosf(theta) * mscale;
                 ^
program_source:1086:18: error: use of undeclared identifier 'sinf'
    *sin_theta = sinf(theta) * mscale;
                 ^
program_source:1172:33: error: use of undeclared identifier 'n_orig_ctx'
    rope_yarn_corr_dims(n_dims, n_orig_ctx, freq_base, beta_fast, beta_slow, corr_dims);
                                ^
program_source:1223:57: error: explicit instantiation of 'kernel_rope' does not refer to a function template, variable template, member function, member class, or static data member
template [[host_name("kernel_rope_f32")]] kernel rope_t kernel_rope<float>;
                                                        ^
program_source:1133:13: note: candidate template ignored: could not match 'void (const device void *, const device int32_t *, device float *, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, uint, uint3, uint3)' (aka 'void (const device void *, const device int *, device float *, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, unsigned int, uint3, uint3)') against 'void (const device void *, const device int32_t *, device float *, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const 
constant uint64_t &, const constant uint64_t &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, uint, uint3, uint3)' (aka 'void (const device void *, const device int *, device float *, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, unsigned int, uint3, uint3)')
kernel void kernel_rope(
            ^
program_source:1224:57: error: explicit instantiation of 'kernel_rope' does not refer to a function template, variable template, member function, member class, or static data member
template [[host_name("kernel_rope_f16")]] kernel rope_t kernel_rope<half>;
                                                        ^
program_source:1133:13: note: candidate template ignored: could not match 'void (const device void *, const device int32_t *, device float *, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, uint, uint3, uint3)' (aka 'void (const device void *, const device int *, device float *, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, const constant float &, unsigned int, uint3, uint3)') against 'void (const device void *, const device int32_t *, device float *, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant uint64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant int64_t &, const constant uint64_t &, const constant uint64_t &, const 
constant uint64_t &, const constant uint64_t &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, uint, uint3, uint3)' (aka 'void (const device void *, const device int *, device float *, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant long &, const constant long &, const constant long &, const constant long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant unsigned long &, const constant int &, const constant int &, const constant int &, const constant float &, const constant float &, unsigned int, uint3, uint3)')
kernel void kernel_rope(
            ^
}
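
For readers following the log, here is a minimal host-side C++ sketch of the helper that the first batch of errors points at, with comments marking what the Metal Shading Language (MSL) requires differently. It is not the actual patch from this PR. The function names `rope_yarn_sketch` and `rope_yarn_ramp_sketch`, the body of the ramp helper, the `theta_interp = freq_scale * theta_extrap` line, and the `ext_factor != 0` guard are assumptions for illustration. Only the statements quoted in the log above come from the source.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical ramp helper: the log only shows the call site, so this
// clamped linear ramp is an assumption, not the code from the PR.
static float rope_yarn_ramp_sketch(float low, float high, int64_t i0) {
    const float y = (static_cast<float>(i0 / 2) - low) / std::max(0.001f, high - low);
    return 1.0f - std::min(1.0f, std::max(0.0f, y));
}

// Plain C++ version of the helper from the error log.
// In MSL the two output pointers need an explicit address space,
// e.g. `thread float * cos_theta, thread float * sin_theta`.
static void rope_yarn_sketch(
        float theta_extrap, float freq_scale, const float corr_dims[2], int64_t i0,
        float ext_factor, float mscale,
        float * cos_theta, float * sin_theta) {
    assert(freq_scale > 0.0f); // log(1 / freq_scale) below needs a positive scale

    // Assumption: interpolated angle is the extrapolated angle scaled by freq_scale.
    const float theta_interp = freq_scale * theta_extrap;
    float theta = theta_interp;

    if (ext_factor != 0.0f) {
        // The log reports 'ramp_mix' as undeclared: it has to be declared here.
        const float ramp_mix = rope_yarn_ramp_sketch(corr_dims[0], corr_dims[1], i0) * ext_factor;
        theta = theta_interp * (1 - ramp_mix) + theta_extrap * ramp_mix;

        // MSL has no logf/cosf/sinf: use log, cos and sin from <metal_stdlib>.
        mscale *= 1.0f + 0.1f * std::log(1.0f / freq_scale);
    }

    *cos_theta = std::cos(theta) * mscale;
    *sin_theta = std::sin(theta) * mscale;
}
```

The remaining errors (the undeclared `n_orig_ctx` and the failed `kernel_rope` instantiations) concern the kernel's parameter list, which has to match the `rope_t` signature it is instantiated against; that part is not reproduced here.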
ggerganov metal : fix build errors and kernel sig after #2268
396412c0
ggerganov merged 183b3fac into master 1 year ago
ggerganov deleted the fix-metal-after-yarn branch 1 year ago
TortoiseHam, 1 year ago

I'm also getting an error when I try to quantize the llama2 model now, although it was working with an older version of the code base:

[ 1/ 723] token_embd.weight - [ 8192, 32000, 1, 1], type = f16, quantizing to q4_K .. zsh: illegal hardware instruction

ggerganov, 1 year ago (edited 1 year ago)

Have you recently upgraded to Sonoma?

Ever since I upgraded, K-quants are broken for me like this. This crash only occurs in Release (-O3) builds. Debug and -O2 work fine. Adding a print to debug this makes the issue disappear. So I have no idea how to fix it atm.

My theory is something is wrong with the compiler.
If you can show me a commit where it works, I'll take a look. But atm I don't think this is a llama.cpp-related problem.

pgeiss, 1 year ago 👍 1

Thanks for this PR. I just cloned this project for the first time recently and ran into this issue. I thought I must have done something wrong when building! I pulled again and ran make clean; make, and it works perfectly now.

ggerganov, 1 year ago 😄 1

Yup, that's how we do it here - we test in production 😆

Synchro, 1 year ago

I ran into the same thing yesterday on M1 Max running macOS 13.6, and can confirm that this is fixed here too.

cebtenzzre, 1 year ago 👍 1

When I originally wrote this code, I had to ask a friend of a friend for remote access to his (Intel) Mac so I could verify that I even got the syntax correct.

My company got me an M2 Macbook, so I should be able to write better Metal code in the future.

Sorry for all the breakage 😅

ggerganov, 1 year ago

> My company got me an M2 Macbook

eh.. should've gone with the new M3 Macbook :)

TortoiseHam, 1 year ago 😄 1

> Have you recently upgraded to Sonoma?
>
> Ever since I upgraded, K-quants are broken for me like this. This crash only occurs in Release (-O3) builds. Debug and -O2 work fine. Adding a print to debug this makes the issue disappear. So I have no idea how to fix it atm.
>
> My theory is something is wrong with the compiler.
>
> If you can show me a commit where it works, I'll take a look. But atm I don't think this is a llama.cpp-related problem.

Ah, yeah, I did just upgrade recently. If it works with a print message put in then maybe that is the solution... 😜
