The `llama.cpp` reorganization should now be mostly ready and can be tested. The `ggml` code, tests and cmake scripts will be shared across all 3 repos (as if it were a git submodule). Some CI workflows are still failing - any help with resolving these is appreciated. I'll now focus on updating `whisper.cpp` to reuse `ggml` in a similar way and then look to merge this sometime next week. Pay extra attention to the new build options (e.g. `LLAMA_CUDA` is now `GGML_CUDA`).
The windows build is failing because some targets do not link to the libraries they need. On linux this works because in the end all the symbols are put together indiscriminately, but not on windows.

The patch below solves some of the problems, but some tests depend on internal functions of `llama.cpp`, such as `unicode_cpt_to_utf8`, that are not exported. I am also not sure if this is a real solution, because with static builds I think this is going to cause multiple versions of ggml and llama to be linked in some binaries.
```diff
diff --git a/common/CMakeLists.txt b/common/CMakeLists.txt
index c6fccc02..299d53c3 100644
--- a/common/CMakeLists.txt
+++ b/common/CMakeLists.txt
@@ -85,4 +85,4 @@ endif ()
 
 target_include_directories(${TARGET} PUBLIC .)
 target_compile_features (${TARGET} PUBLIC cxx_std_11)
-target_link_libraries (${TARGET} PRIVATE ${LLAMA_COMMON_EXTRA_LIBS} PUBLIC llama Threads::Threads)
+target_link_libraries (${TARGET} PRIVATE ${LLAMA_COMMON_EXTRA_LIBS} PUBLIC ggml llama Threads::Threads)
diff --git a/examples/benchmark/CMakeLists.txt b/examples/benchmark/CMakeLists.txt
index 34a58cc0..358fd130 100644
--- a/examples/benchmark/CMakeLists.txt
+++ b/examples/benchmark/CMakeLists.txt
@@ -1,6 +1,6 @@
 set(TARGET llama-bench-matmult)
 add_executable(${TARGET} benchmark-matmult.cpp)
 install(TARGETS ${TARGET} RUNTIME)
-target_link_libraries(${TARGET} PRIVATE llama build_info ${CMAKE_THREAD_LIBS_INIT})
+target_link_libraries(${TARGET} PRIVATE ggml llama build_info ${CMAKE_THREAD_LIBS_INIT})
 target_include_directories(${TARGET} PRIVATE ../../common)
 target_compile_features(${TARGET} PRIVATE cxx_std_11)
diff --git a/examples/gbnf-validator/CMakeLists.txt b/examples/gbnf-validator/CMakeLists.txt
index 4edd6ec7..29530835 100644
--- a/examples/gbnf-validator/CMakeLists.txt
+++ b/examples/gbnf-validator/CMakeLists.txt
@@ -1,5 +1,5 @@
 set(TARGET llama-gbnf-validator)
 add_executable(${TARGET} gbnf-validator.cpp)
 install(TARGETS ${TARGET} RUNTIME)
-target_link_libraries(${TARGET} PRIVATE common llama ${CMAKE_THREAD_LIBS_INIT})
+target_link_libraries(${TARGET} PRIVATE common ggml llama ${CMAKE_THREAD_LIBS_INIT})
 target_compile_features(${TARGET} PRIVATE cxx_std_11)
diff --git a/examples/llama-bench/CMakeLists.txt b/examples/llama-bench/CMakeLists.txt
index 5bdbea4e..eb1918c2 100644
--- a/examples/llama-bench/CMakeLists.txt
+++ b/examples/llama-bench/CMakeLists.txt
@@ -1,5 +1,5 @@
 set(TARGET llama-bench)
 add_executable(${TARGET} llama-bench.cpp)
 install(TARGETS ${TARGET} RUNTIME)
-target_link_libraries(${TARGET} PRIVATE common llama ${CMAKE_THREAD_LIBS_INIT})
+target_link_libraries(${TARGET} PRIVATE ggml common llama ${CMAKE_THREAD_LIBS_INIT})
 target_compile_features(${TARGET} PRIVATE cxx_std_11)
diff --git a/examples/quantize-stats/CMakeLists.txt b/examples/quantize-stats/CMakeLists.txt
index bb986a71..73e7f943 100644
--- a/examples/quantize-stats/CMakeLists.txt
+++ b/examples/quantize-stats/CMakeLists.txt
@@ -1,6 +1,6 @@
 set(TARGET llama-quantize-stats)
 add_executable(${TARGET} quantize-stats.cpp)
 install(TARGETS ${TARGET} RUNTIME)
-target_link_libraries(${TARGET} PRIVATE llama build_info ${CMAKE_THREAD_LIBS_INIT})
+target_link_libraries(${TARGET} PRIVATE ggml llama build_info ${CMAKE_THREAD_LIBS_INIT})
 target_include_directories(${TARGET} PRIVATE ../../common)
 target_compile_features(${TARGET} PRIVATE cxx_std_11)
diff --git a/include/llama.h b/include/llama.h
index 05d8b092..d833ba4a 100644
--- a/include/llama.h
+++ b/include/llama.h
@@ -1124,23 +1124,23 @@ struct llama_grammar_candidate {
     llama_partial_utf8 partial_utf8;
 };
 
-const std::vector<std::pair<std::string, struct ggml_tensor *>> & llama_internal_get_tensor_map(
+LLAMA_API const std::vector<std::pair<std::string, struct ggml_tensor *>> & llama_internal_get_tensor_map(
     struct llama_context * ctx
 );
 
-void llama_grammar_accept(
+LLAMA_API void llama_grammar_accept(
         const std::vector<std::vector<llama_grammar_element>> & rules,
         const std::vector<std::vector<const llama_grammar_element *>> & stacks,
         const uint32_t chr,
         std::vector<std::vector<const llama_grammar_element *>> & new_stacks);
 
-std::pair<std::vector<uint32_t>, llama_partial_utf8> decode_utf8(
+LLAMA_API std::pair<std::vector<uint32_t>, llama_partial_utf8> decode_utf8(
         const std::string & src,
         llama_partial_utf8 partial_start);
 
 // Randomly selects a token from the candidates based on their probabilities using given std::mt19937.
 // This is a temporary workaround in order to fix race conditions when sampling with multiple sequences.
-llama_token llama_sample_token_with_rng(struct llama_context * ctx, llama_token_data_array * candidates, std::mt19937 & rng);
+LLAMA_API llama_token llama_sample_token_with_rng(struct llama_context * ctx, llama_token_data_array * candidates, std::mt19937 & rng);
 
 #endif // LLAMA_API_INTERNAL
diff --git a/tests/CMakeLists.txt b/tests/CMakeLists.txt
index cfa70731..f70cdb2b 100644
--- a/tests/CMakeLists.txt
+++ b/tests/CMakeLists.txt
@@ -55,7 +55,7 @@ function(llama_target_and_test source)
     add_executable(${TEST_TARGET} ${source} get-model.cpp)
     install(TARGETS ${TEST_TARGET} RUNTIME)
-    target_link_libraries(${TEST_TARGET} PRIVATE common)
+    target_link_libraries(${TEST_TARGET} PRIVATE ggml llama common)
     add_test(
         NAME ${TEST_TARGET}
         WORKING_DIRECTORY ${LLAMA_TEST_WORKING_DIRECTORY}
```
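For context, this is ordinary CMake link-propagation behavior (a generic illustration with hypothetical targets, not code from the PR): only `PUBLIC` dependencies are forwarded to consumers, and on windows every target must explicitly link everything whose symbols it uses:

```cmake
# only PUBLIC link dependencies propagate to consumers of a target
add_library(ggml  ggml.c)
add_library(llama llama.cpp)
target_link_libraries(llama PUBLIC ggml)    # consumers of llama also link ggml

add_library(common common.cpp)
target_link_libraries(common PUBLIC llama)  # a test linking only common still
                                            # gets llama and, transitively, ggml
```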
It works with `-DBUILD_SHARED_LIBS=OFF`, so I am probably misunderstanding the error.
I noticed now that `CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS` is set in ggml when building on windows with shared libs. I think the problem actually is that `CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS` is not set early enough, because configuring with `-DCMAKE_WINDOWS_EXPORT_ALL_SYMBOLS=ON` also works, without requiring the changes that I listed before.

This seems to work. My assumption is that `CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS` needs to be set per project, unless it is passed through the command line, in which case it is applied to every subproject as well - but I am not sure if that's correct.

However, all of this seems like a hack: we go through the effort of using `dllexport` in `ggml.h` and `llama.h` for all the symbols that should be exported, but then we throw all of this away and use `CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS` to export every symbol.
```diff
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 6b9b5413..96718d75 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -6,6 +6,7 @@ include(CheckIncludeFileCXX)
 
 set(CMAKE_WARN_UNUSED_CLI YES)
 set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
+set(CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON)
 
 if (NOT XCODE AND NOT MSVC AND NOT CMAKE_BUILD_TYPE)
     set(CMAKE_BUILD_TYPE Release CACHE STRING "Build type" FORCE)
diff --git a/common/CMakeLists.txt b/common/CMakeLists.txt
index c6fccc02..02415f2d 100644
--- a/common/CMakeLists.txt
+++ b/common/CMakeLists.txt
@@ -1,5 +1,7 @@
 # common
 
+set(CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON)
+
 find_package(Threads REQUIRED)
 
 # Build info header
diff --git a/ggml/CMakeLists.txt b/ggml/CMakeLists.txt
index bdbda425..21ef7e4a 100644
--- a/ggml/CMakeLists.txt
+++ b/ggml/CMakeLists.txt
@@ -3,6 +3,7 @@ project("ggml" C CXX)
 include(CheckIncludeFileCXX)
 
 set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
+set(CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON)
 
 if (NOT XCODE AND NOT MSVC AND NOT CMAKE_BUILD_TYPE)
     set(CMAKE_BUILD_TYPE Release CACHE STRING "Build type" FORCE)
diff --git a/ggml/src/CMakeLists.txt b/ggml/src/CMakeLists.txt
index 84bc8e19..b7c79321 100644
--- a/ggml/src/CMakeLists.txt
+++ b/ggml/src/CMakeLists.txt
@@ -825,10 +825,6 @@ endif()
 
 if (WIN32)
     add_compile_definitions(_CRT_SECURE_NO_WARNINGS)
-
-    if (BUILD_SHARED_LIBS)
-        set(CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON)
-    endif()
 endif()
 
 if (GGML_LTO)
```
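If we do keep relying on this, a somewhat less blunt variant (a sketch, untested) would be to scope the export to the specific shared library targets via the target property instead of the global variable:

```cmake
# sketch: export all symbols only from the ggml/llama shared libraries,
# instead of setting CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS in every project
if (WIN32 AND BUILD_SHARED_LIBS)
    set_target_properties(ggml llama PROPERTIES WINDOWS_EXPORT_ALL_SYMBOLS ON)
endif()
```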
In `CMakeLists.txt` (lines 70-72):

```cmake
# override ggml options
set(GGML_CCACHE            ${LLAMA_CCACHE})
set(GGML_BUILD_SHARED_LIBS ${LLAMA_BUILD_SHARED_LIBS})
```

`GGML_BUILD_SHARED_LIBS` and `LLAMA_BUILD_SHARED_LIBS` do not exist; `BUILD_SHARED_LIBS` is used directly.
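A minimal sketch of what the override could look like instead (my assumption about the intent, untested):

```cmake
# sketch: BUILD_SHARED_LIBS is the standard variable that add_library()
# consults, and normal variables are visible inside add_subdirectory(),
# so the ggml subproject can pick it up without a GGML_/LLAMA_ variant
set(GGML_CCACHE ${LLAMA_CCACHE})
add_subdirectory(ggml)  # ggml's add_library() calls see BUILD_SHARED_LIBS as-is
```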
If you are reorganizing the source code, I would like to suggest moving some of the compiled tools from the examples folder into a dedicated folder. They are not quite scripts, but in my opinion they are not examples either, considering they are more for maintainers' internal usage.
The PRs are ready for review. The CI in `llama.cpp` is still failing for some of the workflows:

- I think HIPBLAS is failing because of `enable_language(HIP)` being inside `ggml/src/CMakeLists.txt` instead of the root `CMakeLists.txt`. This is causing the undefined `CMAKE_HIP_` variables later on: https://github.com/ggerganov/llama.cpp/actions/runs/9657886762/job/26637996149?pr=8006. Not sure how to fix (does it work on `master`?)
- The `kompute` build is failing - @cebtenzzre could you assist with this? It should be a simple fix, but I wasn't able to set up the `kompute` build on my machine
- The `whisper.cpp` `clang linux/arm/v7` workflow no longer works, though it looks like a `clang` problem: https://github.com/ggerganov/whisper.cpp/actions/runs/9657764109/job/26637633042?pr=2256#step:4:1990

There are quite a few things to improve upon here, but it is probably better to merge first and resolve them from `master`.
> I think HIPBLAS is failing because of `enable_language(HIP)` being inside `ggml/src/CMakeLists.txt` instead of the root `CMakeLists.txt`. This is causing the undefined `CMAKE_HIP_` variables later on: https://github.com/ggerganov/llama.cpp/actions/runs/9657886762/job/26637996149?pr=8006. Not sure how to fix
This may be related to https://cmake.org/cmake/help/latest/policy/CMP0165.html#policy:CMP0165. It may also affect CUDA. I get these warnings:
```
CMake Warning (dev) in src/CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed. Run "cmake --help-policy CMP0104"
  for policy details. Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "llama".
This warning is for project developers. Use -Wno-dev to suppress it.

CMake Warning (dev) in examples/gguf/CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed. Run "cmake --help-policy CMP0104"
  for policy details. Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "llama-gguf".
This warning is for project developers. Use -Wno-dev to suppress it.

CMake Warning (dev) in examples/llava/CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed. Run "cmake --help-policy CMP0104"
  for policy details. Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "llava_shared".
This warning is for project developers. Use -Wno-dev to suppress it.
```
I am not sure what the solution is supposed to be in this case. I think the source of the problem is using multiple `CMakeLists.txt` files for the same target. Maybe this could be fixed by using `include` instead of `add_subdirectory`.
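For illustration, the difference between the two mechanisms (a generic sketch, not code from the PR):

```cmake
# option 1: new directory scope and binary dir; enable_language() and policy
# settings made inside src/ do not behave as if written in the parent
add_subdirectory(src)

# option 2: textual inclusion into the current scope, as if the commands
# were written directly in this CMakeLists.txt
include(${CMAKE_CURRENT_SOURCE_DIR}/src/CMakeLists.txt)
```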
Probably the simplest solution would be to remove the `src/CMakeLists.txt` entirely and merge it into the base `CMakeLists.txt`. I think the expectation is that `add_subdirectory` is only used when a project has multiple targets.
Each backend could be moved to a different target. This will need to be done when making the backends dynamically loadable in the future.
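A hypothetical sketch of that direction (target and source names are illustrative, not from the PR):

```cmake
# hypothetical: each backend is its own library target linked into ggml;
# building these as SHARED and loading them at runtime would be the
# dynamically-loadable variant
if (GGML_CUDA)
    enable_language(CUDA)
    add_library(ggml-cuda ggml-cuda.cu)
    target_link_libraries(ggml PRIVATE ggml-cuda)
endif()
```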
> Probably the simplest solution would be to remove the `src/CMakeLists.txt` entirely and merge it into the base `CMakeLists.txt`.

We still need to `add_subdirectory(ggml)` in the root `llama.cpp/CMakeLists.txt` - wouldn't it hit the same issue?
I've resolved these warnings in `whisper.cpp` by enabling CUDA in `whisper.cpp/src/CMakeLists.txt`: https://github.com/ggerganov/whisper.cpp/blob/f3c609bd2e69b40f7500af8e13c1db270f997403/src/CMakeLists.txt#L80-L101

We can do the same in `llama.cpp/src/CMakeLists.txt`, but for HIP it seems more complicated because it checks for some compiler compatibility.
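Roughly, the linked change amounts to something like this (a paraphrased sketch; the exact checks and default architectures there may differ):

```cmake
# sketch: enable the CUDA language in the same scope where the CUDA sources
# are added, so CMAKE_CUDA_ARCHITECTURES is detected for those targets
if (GGML_CUDA)
    find_package(CUDAToolkit)
    if (CUDAToolkit_FOUND)
        enable_language(CUDA)
        if (NOT DEFINED CMAKE_CUDA_ARCHITECTURES)
            set(CMAKE_CUDA_ARCHITECTURES "52;61;70")  # assumption: default archs
        endif()
    endif()
endif()
```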
> We still need to `add_subdirectory(ggml)` in the root `llama.cpp/CMakeLists.txt` - wouldn't it hit the same issue?
My expectation was that ggml would be built either as a static or a shared library, and then linked into llama.cpp. However, this does not seem to be happening, because the `ggml` target is created as an "object" library, which seems to mean that it does not behave as a library at all - it is more like adding the source files to the parent project. Changing this fixes the CUDA warnings for me.
```diff
diff --git a/ggml/src/CMakeLists.txt b/ggml/src/CMakeLists.txt
index 4f146026..f7ecdc2a 100644
--- a/ggml/src/CMakeLists.txt
+++ b/ggml/src/CMakeLists.txt
@@ -1129,7 +1129,7 @@ endif()
 
 # ggml
 
-add_library(ggml OBJECT
+add_library(ggml
             ../include/ggml.h
             ../include/ggml-alloc.h
             ../include/ggml-backend.h
```
Is there any reason to use an `OBJECT` library?

Ah, it wasn't intended to be an object library - I just didn't notice. Nice catch.
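For reference, a minimal generic example of the semantic difference (not from the PR):

```cmake
# an OBJECT library produces no .a/.so/.dll of its own; consumers absorb its
# object files directly, so it behaves like adding sources to the consumer
add_library(foo OBJECT foo.c)
add_library(app STATIC $<TARGET_OBJECTS:foo> main.c)

# a regular library (STATIC, or SHARED per BUILD_SHARED_LIBS) is a real
# artifact that other targets link against
add_library(bar bar.c)
target_link_libraries(app PRIVATE bar)
```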
There are still some references to `LLAMA_CUDA` in the `Makefile`:
```diff
diff --git a/Makefile b/Makefile
index 88084cac..80e7678d 100644
--- a/Makefile
+++ b/Makefile
@@ -859,7 +859,7 @@ override NVCCFLAGS := $(MK_NVCCFLAGS) $(NVCCFLAGS)
 override LDFLAGS   := $(MK_LDFLAGS) $(LDFLAGS)
 
 # identify CUDA host compiler
-ifdef LLAMA_CUDA
+ifdef GGML_CUDA
 GF_CC := $(NVCC) $(NVCCFLAGS) 2>/dev/null .c -Xcompiler
 include scripts/get-flags.mk
 CUDA_CXXFLAGS := $(BASE_CXXFLAGS) $(GF_CXXFLAGS) -Wno-pedantic
@@ -884,7 +884,7 @@ $(info I NVCCFLAGS: $(NVCCFLAGS))
 $(info I LDFLAGS:   $(LDFLAGS))
 $(info I CC:        $(shell $(CC) --version | head -n 1))
 $(info I CXX:       $(shell $(CXX) --version | head -n 1))
-ifdef LLAMA_CUDA
+ifdef GGML_CUDA
 $(info I NVCC:      $(shell $(NVCC) --version | tail -n 1))
 CUDA_VERSION := $(shell $(NVCC) --version | grep -oP 'release (\K[0-9]+\.[0-9])')
 ifeq ($(shell awk -v "v=$(CUDA_VERSION)" 'BEGIN { print (v < 11.7) }'),1)
@@ -896,7 +896,7 @@ endif # CUDA_POWER_ARCH
 endif # CUDA_DOCKER_ARCH
 endif # eq ($(shell echo "$(CUDA_VERSION) < 11.7" | bc),1)
-endif # LLAMA_CUDA
+endif # GGML_CUDA
 $(info )
 
 ifdef DEPRECATE_WARNING
```
@slaren The `LLAMA_CUDA_FORCE_CUBLAS` cmake option got mistakenly removed but is still used. I believe it's intended to be mutually exclusive with `GGML_CUDA_FORCE_MMQ`, so some changes might be needed (cmake, mmq.cu, ggml-cuda.cu, ..).
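If the option does need to come back, a hypothetical sketch of the cmake side (the mutual-exclusion guard is my assumption, not existing code):

```cmake
# hypothetical: restore the option and make the exclusivity explicit
option(GGML_CUDA_FORCE_CUBLAS "ggml: always use cuBLAS instead of mmq kernels" OFF)

if (GGML_CUDA_FORCE_CUBLAS AND GGML_CUDA_FORCE_MMQ)
    message(FATAL_ERROR "GGML_CUDA_FORCE_CUBLAS and GGML_CUDA_FORCE_MMQ are mutually exclusive")
endif()
```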
There is now `GGML_CUDA_FORCE_CUBLAS`.

Edit: nvm, see #8140.
ref #7573, #6913

Adopt a new structure of the source code that makes the `ggml` source code and build scripts easier to reuse.

Note that the build options that are relevant to the `ggml` library are now prefixed with `GGML_`. For example, `LLAMA_CUDA` and `WHISPER_METAL` are now `GGML_CUDA` and `GGML_METAL`. However, `WHISPER_COREML` and `WHISPER_OPENVINO` are still named the same because the CoreML and OpenVINO functionality in `whisper.cpp` is not part of the `ggml` library.
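For downstream build scripts, the rename could be bridged with a small compatibility shim (a hypothetical sketch, not part of this PR):

```cmake
# hypothetical: accept the old option name for a transition period
if (DEFINED LLAMA_CUDA)
    message(DEPRECATION "LLAMA_CUDA is deprecated, use GGML_CUDA instead")
    set(GGML_CUDA ${LLAMA_CUDA})
endif()
```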
The `Makefiles` in `llama.cpp` and `whisper.cpp` have been updated to be more similar to each other.

Header files (such as `ggml.h`, `llama.h` and `whisper.h`) are now placed in `include` subfolders, while the source files are placed in `src`.

PRs in other projects that will be updated and merged together with this one:
TODOs:

- `ggml` repo
- `whisper.cpp` repo
- `llama.cpp`
- `whisper.cpp`
- `llama.cpp`
- `ggml`
- `whisper.cpp`
- `ggml/cmake` (`master`?)
- `GGML_` options
- `llama.cpp`
- `whisper.cpp`
Changes:

- deprecate `LLAMA_XXX` in favor of `GGML_XXX`
Resolved PRs:

TODO in follow-up PRs:

- Move `tests` to `ggml/tests`
- Stop using `CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS` and build and link shared libraries properly
- `ggml` folder level