llama.cpp
llama : reorganize source code + improve CMake
#8006
Merged

ggerganov merged 13 commits into master from gg/reorganize-project
ggerganov · 329 days ago (edited 322 days ago) · 👍 6

ref #7573, #6913

Adopt a new source code structure that makes the ggml source code and build scripts easier to reuse.

Note that the build options that are relevant to the ggml library are now prefixed with GGML_. For example, LLAMA_CUDA and WHISPER_METAL are now GGML_CUDA and GGML_METAL. However, WHISPER_COREML and WHISPER_OPENVINO are still named the same because the CoreML and OpenVINO functionality in whisper.cpp is not part of the ggml library.
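As a sketch (the parent project, target and path names below are placeholders, not from this PR), an embedding project that used to pass LLAMA_CUDA would now set the GGML_ option before pulling in the subproject:

# Hypothetical parent CMakeLists.txt embedding llama.cpp after this change.
# GGML_CUDA replaces LLAMA_CUDA; set it before add_subdirectory().
set(GGML_CUDA ON CACHE BOOL "enable the CUDA backend" FORCE)
add_subdirectory(llama.cpp)                 # path to the checked-out repo (illustrative)

add_executable(my_app main.cpp)             # placeholder target and source
target_link_libraries(my_app PRIVATE llama)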

The Makefiles in llama.cpp and whisper.cpp have been updated to be more similar to each other.

Header files (such as ggml.h, llama.h and whisper.h) are now placed in include subfolders, while the source files are placed in src.
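For downstream build scripts that previously pointed at the repository root for headers, a minimal sketch of the adjustment (paths per the new layout described above; the consumer target is hypothetical):

# The public headers now live in include/ subfolders rather than the repo root:
#   ggml/include/ggml.h  and  include/llama.h  (in llama.cpp)
target_include_directories(my_tool PRIVATE
    ${CMAKE_SOURCE_DIR}/include          # llama.h
    ${CMAKE_SOURCE_DIR}/ggml/include)    # ggml.h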

PRs in other projects that will be updated and merged together with this one:

TODOs:

  • update ggml repo
  • update whisper.cpp repo
  • update Makefiles
    • llama.cpp
    • whisper.cpp
  • update sync scripts
    • llama.cpp
    • ggml
    • whisper.cpp
    • add ggml/cmake
  • fix examples
  • fix CI
    • HIPBLAS
    • kompute
    • nix (maybe will work after merge to master?)
    • windows
  • update README with new GGML_ options
    • llama.cpp
    • whisper.cpp

Changes:

  • deprecate LLAMA_XXX in favor of GGML_XXX
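A minimal sketch of such a deprecation shim (the helper name and exact behavior here are hypothetical, not necessarily what the PR ships):

# Warn when an old LLAMA_-prefixed option is set and forward its value to the
# new GGML_-prefixed equivalent.
function(option_depr TYPE OLD NEW)
    if (DEFINED ${OLD})
        message(${TYPE} "${OLD} is deprecated, use ${NEW} instead")
        set(${NEW} ${${OLD}} PARENT_SCOPE)
    endif()
endfunction()

option_depr(WARNING LLAMA_CUDA  GGML_CUDA)
option_depr(WARNING LLAMA_METAL GGML_METAL)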

Resolved PRs:

TODO in follow-up PRs:

  • move relevant tests from tests to ggml/tests
  • avoid using CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS and build and link shared libraries properly
  • link shared libs to examples by default
  • simplify sync scripts to work on the ggml folder level

github-actions added build
github-actions added python
ggerganov force-pushed from 729c7ccb to cde2512a 329 days ago
mofosyne added Review Complexity : Medium
ggerganov force-pushed from 62d26cce to 3ab24ad6 329 days ago
ggerganov force-pushed from 3ab24ad6 to b1064a77 328 days ago
ggerganov force-pushed from 13a97767 to ad6d310d 328 days ago
ggerganov force-pushed from ad6d310d to c79bb52d 328 days ago
ggerganov force-pushed from c79bb52d to b6a1addd 328 days ago
ggerganov force-pushed from 3f4a147e to faaaeffe 328 days ago
ggerganov force-pushed from faaaeffe to b52fde17 328 days ago
ggerganov force-pushed from c18326af to 5c5e4d40 328 days ago
ggerganov force-pushed from 5c5e4d40 to ace2b97c 328 days ago
ggerganov force-pushed from ace2b97c to 8216c4c3 328 days ago
github-actions added script
ggerganov force-pushed from 8216c4c3 to 5b1490a6 328 days ago
github-actions added devops
ggerganov force-pushed from c9a3dc85 to 73d5f903 327 days ago
github-actions added documentation
github-actions added nix
github-actions added examples
github-actions added SYCL
ggerganov · 327 days ago · ❤ 1

The llama.cpp reorganization should now be mostly ready and can be tested. The ggml code, tests and CMake scripts will be shared across all 3 repos (as if it were a git submodule).

Some CI workflows are still failing - any help with resolving these is appreciated. I'll now focus on updating whisper.cpp to reuse ggml in a similar way, and then look to merge this sometime next week. Pay extra attention to the new build options (e.g. LLAMA_CUDA is now GGML_CUDA).

ggerganov force-pushed from e550e3e0 to 7c1c378c 327 days ago
slaren · 327 days ago (edited 327 days ago)

The Windows build is failing because some targets do not link to the libraries they need. On Linux that works because in the end all the symbols are put together indiscriminately, but not on Windows.

This solves some of the problems, but some tests depend on internal functions of llama.cpp, such as unicode_cpt_to_utf8, that are not exported. I am also not sure if this is a real solution, because with static builds I think this is going to cause multiple versions of ggml and llama to be linked into some binaries.

diff --git a/common/CMakeLists.txt b/common/CMakeLists.txt
index c6fccc02..299d53c3 100644
--- a/common/CMakeLists.txt
+++ b/common/CMakeLists.txt
@@ -85,4 +85,4 @@ endif ()

 target_include_directories(${TARGET} PUBLIC .)
 target_compile_features   (${TARGET} PUBLIC cxx_std_11)
-target_link_libraries     (${TARGET} PRIVATE ${LLAMA_COMMON_EXTRA_LIBS} PUBLIC llama Threads::Threads)
+target_link_libraries     (${TARGET} PRIVATE ${LLAMA_COMMON_EXTRA_LIBS} PUBLIC ggml llama Threads::Threads)
diff --git a/examples/benchmark/CMakeLists.txt b/examples/benchmark/CMakeLists.txt
index 34a58cc0..358fd130 100644
--- a/examples/benchmark/CMakeLists.txt
+++ b/examples/benchmark/CMakeLists.txt
@@ -1,6 +1,6 @@
 set(TARGET llama-bench-matmult)
 add_executable(${TARGET} benchmark-matmult.cpp)
 install(TARGETS ${TARGET} RUNTIME)
-target_link_libraries(${TARGET} PRIVATE llama build_info ${CMAKE_THREAD_LIBS_INIT})
+target_link_libraries(${TARGET} PRIVATE ggml llama build_info ${CMAKE_THREAD_LIBS_INIT})
 target_include_directories(${TARGET} PRIVATE ../../common)
 target_compile_features(${TARGET} PRIVATE cxx_std_11)
diff --git a/examples/gbnf-validator/CMakeLists.txt b/examples/gbnf-validator/CMakeLists.txt
index 4edd6ec7..29530835 100644
--- a/examples/gbnf-validator/CMakeLists.txt
+++ b/examples/gbnf-validator/CMakeLists.txt
@@ -1,5 +1,5 @@
 set(TARGET llama-gbnf-validator)
 add_executable(${TARGET} gbnf-validator.cpp)
 install(TARGETS ${TARGET} RUNTIME)
-target_link_libraries(${TARGET} PRIVATE common llama ${CMAKE_THREAD_LIBS_INIT})
+target_link_libraries(${TARGET} PRIVATE common ggml llama ${CMAKE_THREAD_LIBS_INIT})
 target_compile_features(${TARGET} PRIVATE cxx_std_11)
diff --git a/examples/llama-bench/CMakeLists.txt b/examples/llama-bench/CMakeLists.txt
index 5bdbea4e..eb1918c2 100644
--- a/examples/llama-bench/CMakeLists.txt
+++ b/examples/llama-bench/CMakeLists.txt
@@ -1,5 +1,5 @@
 set(TARGET llama-bench)
 add_executable(${TARGET} llama-bench.cpp)
 install(TARGETS ${TARGET} RUNTIME)
-target_link_libraries(${TARGET} PRIVATE common llama ${CMAKE_THREAD_LIBS_INIT})
+target_link_libraries(${TARGET} PRIVATE ggml common llama ${CMAKE_THREAD_LIBS_INIT})
 target_compile_features(${TARGET} PRIVATE cxx_std_11)
diff --git a/examples/quantize-stats/CMakeLists.txt b/examples/quantize-stats/CMakeLists.txt
index bb986a71..73e7f943 100644
--- a/examples/quantize-stats/CMakeLists.txt
+++ b/examples/quantize-stats/CMakeLists.txt
@@ -1,6 +1,6 @@
 set(TARGET llama-quantize-stats)
 add_executable(${TARGET} quantize-stats.cpp)
 install(TARGETS ${TARGET} RUNTIME)
-target_link_libraries(${TARGET} PRIVATE llama build_info ${CMAKE_THREAD_LIBS_INIT})
+target_link_libraries(${TARGET} PRIVATE ggml llama build_info ${CMAKE_THREAD_LIBS_INIT})
 target_include_directories(${TARGET} PRIVATE ../../common)
 target_compile_features(${TARGET} PRIVATE cxx_std_11)
diff --git a/include/llama.h b/include/llama.h
index 05d8b092..d833ba4a 100644
--- a/include/llama.h
+++ b/include/llama.h
@@ -1124,23 +1124,23 @@ struct llama_grammar_candidate {
     llama_partial_utf8   partial_utf8;
 };

-const std::vector<std::pair<std::string, struct ggml_tensor *>> & llama_internal_get_tensor_map(
+LLAMA_API const std::vector<std::pair<std::string, struct ggml_tensor *>> & llama_internal_get_tensor_map(
     struct llama_context * ctx
 );

-void llama_grammar_accept(
+LLAMA_API void llama_grammar_accept(
         const std::vector<std::vector<llama_grammar_element>>         & rules,
         const std::vector<std::vector<const llama_grammar_element *>> & stacks,
         const uint32_t                                                  chr,
         std::vector<std::vector<const llama_grammar_element *>>       & new_stacks);

-std::pair<std::vector<uint32_t>, llama_partial_utf8> decode_utf8(
+LLAMA_API std::pair<std::vector<uint32_t>, llama_partial_utf8> decode_utf8(
         const std::string & src,
         llama_partial_utf8   partial_start);

 // Randomly selects a token from the candidates based on their probabilities using given std::mt19937.
 // This is a temporary workaround in order to fix race conditions when sampling with multiple sequences.
-llama_token llama_sample_token_with_rng(struct llama_context * ctx, llama_token_data_array * candidates, std::mt19937 & rng);
+LLAMA_API llama_token llama_sample_token_with_rng(struct llama_context * ctx, llama_token_data_array * candidates, std::mt19937 & rng);

 #endif // LLAMA_API_INTERNAL

diff --git a/tests/CMakeLists.txt b/tests/CMakeLists.txt
index cfa70731..f70cdb2b 100644
--- a/tests/CMakeLists.txt
+++ b/tests/CMakeLists.txt
@@ -55,7 +55,7 @@ function(llama_target_and_test source)

     add_executable(${TEST_TARGET} ${source} get-model.cpp)
     install(TARGETS ${TEST_TARGET} RUNTIME)
-    target_link_libraries(${TEST_TARGET} PRIVATE common)
+    target_link_libraries(${TEST_TARGET} PRIVATE ggml llama common)
     add_test(
         NAME ${TEST_TARGET}
         WORKING_DIRECTORY ${LLAMA_TEST_WORKING_DIRECTORY}
slaren · 327 days ago

It works with -DBUILD_SHARED_LIBS=OFF, so I am probably misunderstanding the error.

slaren · 327 days ago (edited 327 days ago)

I have now noticed that CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS is set in ggml when building on Windows with shared libs. I think the actual problem is that CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS is not set early enough, because configuring with -DCMAKE_WINDOWS_EXPORT_ALL_SYMBOLS=ON also works, without requiring the changes that I listed before.

This seems to work. My assumption is that CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS needs to be set per project; when it is passed through the command line it is applied to every subproject as well. But I am not sure if that's correct.

However, all of this seems like a hack: we go through the effort of using dllexport in ggml.h and llama.h for all the symbols that should be exported, but then we throw all of that away and use CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS to export every symbol.

diff --git a/CMakeLists.txt b/CMakeLists.txt
index 6b9b5413..96718d75 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -6,6 +6,7 @@ include(CheckIncludeFileCXX)
 set(CMAKE_WARN_UNUSED_CLI YES)

 set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
+set(CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON)

 if (NOT XCODE AND NOT MSVC AND NOT CMAKE_BUILD_TYPE)
     set(CMAKE_BUILD_TYPE Release CACHE STRING "Build type" FORCE)
diff --git a/common/CMakeLists.txt b/common/CMakeLists.txt
index c6fccc02..02415f2d 100644
--- a/common/CMakeLists.txt
+++ b/common/CMakeLists.txt
@@ -1,5 +1,7 @@
 # common

+set(CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON)
+
 find_package(Threads REQUIRED)

 # Build info header
diff --git a/ggml/CMakeLists.txt b/ggml/CMakeLists.txt
index bdbda425..21ef7e4a 100644
--- a/ggml/CMakeLists.txt
+++ b/ggml/CMakeLists.txt
@@ -3,6 +3,7 @@ project("ggml" C CXX)
 include(CheckIncludeFileCXX)

 set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
+set(CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON)

 if (NOT XCODE AND NOT MSVC AND NOT CMAKE_BUILD_TYPE)
     set(CMAKE_BUILD_TYPE Release CACHE STRING "Build type" FORCE)
diff --git a/ggml/src/CMakeLists.txt b/ggml/src/CMakeLists.txt
index 84bc8e19..b7c79321 100644
--- a/ggml/src/CMakeLists.txt
+++ b/ggml/src/CMakeLists.txt
@@ -825,10 +825,6 @@ endif()

 if (WIN32)
     add_compile_definitions(_CRT_SECURE_NO_WARNINGS)
-
-    if (BUILD_SHARED_LIBS)
-        set(CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS ON)
-    endif()
 endif()

 if (GGML_LTO)
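Worth noting for the diff above (a general CMake detail, not specific to this PR): the CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS variable only initializes the WINDOWS_EXPORT_ALL_SYMBOLS property of targets created after it is set, which is consistent with having to set it near the top of each project. The per-target equivalent makes this explicit:

# Per-target form; 'ggml' stands for any library target that should export all
# symbols on Windows.
set_target_properties(ggml PROPERTIES WINDOWS_EXPORT_ALL_SYMBOLS ON)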
slaren commented on 2024-06-21 on CMakeLists.txt. The review hunk shows the old compile-flags section (lines 153-397) being removed:

#
# Compile flags
#

if (LLAMA_SYCL)
    set(CMAKE_CXX_STANDARD 17)
else()
    set(CMAKE_CXX_STANDARD 11)
endif()

set(CMAKE_CXX_STANDARD_REQUIRED true)
set(CMAKE_C_STANDARD 11)
set(CMAKE_C_STANDARD_REQUIRED true)
set(THREADS_PREFER_PTHREAD_FLAG ON)

find_package(Threads REQUIRED)
include(CheckCXXCompilerFlag)

add_compile_definitions(GGML_SCHED_MAX_COPIES=${LLAMA_SCHED_MAX_COPIES})

# enable libstdc++ assertions for debug builds
if (CMAKE_SYSTEM_NAME MATCHES "Linux")
    add_compile_definitions($<$<CONFIG:Debug>:_GLIBCXX_ASSERTIONS>)
endif()

if (NOT MSVC)
    if (LLAMA_SANITIZE_THREAD)
        add_compile_options(-fsanitize=thread)
        link_libraries     (-fsanitize=thread)
    endif()

    if (LLAMA_SANITIZE_ADDRESS)
        add_compile_options(-fsanitize=address -fno-omit-frame-pointer)
        link_libraries     (-fsanitize=address)
    endif()

    if (LLAMA_SANITIZE_UNDEFINED)
        add_compile_options(-fsanitize=undefined)
        link_libraries     (-fsanitize=undefined)
    endif()
endif()

if (APPLE AND LLAMA_ACCELERATE)
    find_library(ACCELERATE_FRAMEWORK Accelerate)
    if (ACCELERATE_FRAMEWORK)
        message(STATUS "Accelerate framework found")

        add_compile_definitions(GGML_USE_ACCELERATE)
        add_compile_definitions(ACCELERATE_NEW_LAPACK)
        add_compile_definitions(ACCELERATE_LAPACK_ILP64)
        set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} ${ACCELERATE_FRAMEWORK})
    else()
        message(WARNING "Accelerate framework not found")
    endif()
endif()

if (LLAMA_METAL)
    find_library(FOUNDATION_LIBRARY Foundation REQUIRED)
    find_library(METAL_FRAMEWORK    Metal      REQUIRED)
    find_library(METALKIT_FRAMEWORK MetalKit   REQUIRED)

    message(STATUS "Metal framework found")
    set(GGML_HEADERS_METAL ggml-metal.h)
    set(GGML_SOURCES_METAL ggml-metal.m)

    add_compile_definitions(GGML_USE_METAL)
    if (LLAMA_METAL_NDEBUG)
        add_compile_definitions(GGML_METAL_NDEBUG)
    endif()

    # copy ggml-common.h and ggml-metal.metal to bin directory
    configure_file(ggml-common.h    ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-common.h    COPYONLY)
    configure_file(ggml-metal.metal ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-metal.metal COPYONLY)

    if (LLAMA_METAL_EMBED_LIBRARY)
        enable_language(ASM)
        add_compile_definitions(GGML_METAL_EMBED_LIBRARY)

        set(METALLIB_COMMON "${CMAKE_CURRENT_SOURCE_DIR}/ggml-common.h")
        set(METALLIB_SOURCE "${CMAKE_CURRENT_SOURCE_DIR}/ggml-metal.metal")

        file(MAKE_DIRECTORY "${CMAKE_BINARY_DIR}/autogenerated")

        # merge ggml-common.h and ggml-metal.metal into a single file
        set(METALLIB_EMBED_ASM    "${CMAKE_BINARY_DIR}/autogenerated/ggml-metal-embed.s")
        set(METALLIB_SOURCE_EMBED "${CMAKE_BINARY_DIR}/autogenerated/ggml-metal-embed.metal")

        add_custom_command(
            OUTPUT ${METALLIB_EMBED_ASM}
            COMMAND echo "Embedding Metal library"
            COMMAND sed -e '/\#include \"ggml-common.h\"/r ${METALLIB_COMMON}' -e '/\#include \"ggml-common.h\"/d' < ${METALLIB_SOURCE} > ${METALLIB_SOURCE_EMBED}
            COMMAND echo ".section __DATA,__ggml_metallib" > ${METALLIB_EMBED_ASM}
            COMMAND echo ".globl _ggml_metallib_start" >> ${METALLIB_EMBED_ASM}
            COMMAND echo "_ggml_metallib_start:" >> ${METALLIB_EMBED_ASM}
            COMMAND echo ".incbin \\\"${METALLIB_SOURCE_EMBED}\\\"" >> ${METALLIB_EMBED_ASM}
            COMMAND echo ".globl _ggml_metallib_end" >> ${METALLIB_EMBED_ASM}
            COMMAND echo "_ggml_metallib_end:" >> ${METALLIB_EMBED_ASM}
            DEPENDS ggml-metal.metal ggml-common.h
            COMMENT "Generate assembly for embedded Metal library"
        )

        set(GGML_SOURCES_METAL ${GGML_SOURCES_METAL} ${METALLIB_EMBED_ASM})
    else()
        if (LLAMA_METAL_SHADER_DEBUG)
            # custom command to do the following:
            #   xcrun -sdk macosx metal    -fno-fast-math -c ggml-metal.metal -o ggml-metal.air
            #   xcrun -sdk macosx metallib                   ggml-metal.air   -o default.metallib
            #
            # note: this is the only way I found to disable fast-math in Metal. it's ugly, but at least it works
            #       disabling fast math is needed in order to pass tests/test-backend-ops
            # note: adding -fno-inline fixes the tests when using MTL_SHADER_VALIDATION=1
            # note: unfortunately, we have to call it default.metallib instead of ggml.metallib
            #       ref: https://github.com/ggerganov/whisper.cpp/issues/1720
            set(XC_FLAGS -fno-fast-math -fno-inline -g)
        else()
            set(XC_FLAGS -O3)
        endif()

        # Append macOS metal versioning flags
        if (LLAMA_METAL_MACOSX_VERSION_MIN)
            message(STATUS "Adding -mmacosx-version-min=${LLAMA_METAL_MACOSX_VERSION_MIN} flag to metal compilation")
            list(APPEND XC_FLAGS -mmacosx-version-min=${LLAMA_METAL_MACOSX_VERSION_MIN})
        endif()
        if (LLAMA_METAL_STD)
            message(STATUS "Adding -std=${LLAMA_METAL_STD} flag to metal compilation")
            list(APPEND XC_FLAGS -std=${LLAMA_METAL_STD})
        endif()

        add_custom_command(
            OUTPUT ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/default.metallib
            COMMAND xcrun -sdk macosx metal ${XC_FLAGS} -c ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-metal.metal -o ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-metal.air
            COMMAND xcrun -sdk macosx metallib ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-metal.air -o ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/default.metallib
            COMMAND rm -f ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-metal.air
            COMMAND rm -f ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-common.h
            COMMAND rm -f ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/ggml-metal.metal
            DEPENDS ggml-metal.metal ggml-common.h
            COMMENT "Compiling Metal kernels"
        )

        add_custom_target(
            ggml-metal ALL
            DEPENDS ${CMAKE_RUNTIME_OUTPUT_DIRECTORY}/default.metallib
        )
    endif() # LLAMA_METAL_EMBED_LIBRARY

    set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS}
        ${FOUNDATION_LIBRARY}
        ${METAL_FRAMEWORK}
        ${METALKIT_FRAMEWORK}
        )
endif()

if (LLAMA_OPENMP)
    find_package(OpenMP)
    if (OpenMP_FOUND)
        message(STATUS "OpenMP found")
        add_compile_definitions(GGML_USE_OPENMP)
        set(LLAMA_EXTRA_LIBS ${LLAMA_EXTRA_LIBS} OpenMP::OpenMP_C OpenMP::OpenMP_CXX)
    else()
        message(WARNING "OpenMP not found")
    endif()
endif()

if (LLAMA_BLAS)
    if (LLAMA_STATIC)
        set(BLA_STATIC ON)
    endif()
    #if (CMAKE_VERSION VERSION_GREATER_EQUAL 3.22)
    #    set(BLA_SIZEOF_INTEGER 8)
    #endif()

    set(BLA_VENDOR ${LLAMA_BLAS_VENDOR})
    find_package(BLAS)

    if (BLAS_FOUND)
        message(STATUS "BLAS found, Libraries: ${BLAS_LIBRARIES}")

        if (("${BLAS_INCLUDE_DIRS}" STREQUAL "") AND NOT (${LLAMA_BLAS_VENDOR} MATCHES "Apple"))
            # BLAS_INCLUDE_DIRS is missing in FindBLAS.cmake.
            # see https://gitlab.kitware.com/cmake/cmake/-/issues/20268
            find_package(PkgConfig REQUIRED)
            if (${LLAMA_BLAS_VENDOR} MATCHES "Generic")
                pkg_check_modules(DepBLAS REQUIRED blas)
            elseif (${LLAMA_BLAS_VENDOR} MATCHES "OpenBLAS")
                # As of openblas v0.3.22, the 64-bit is named openblas64.pc
                pkg_check_modules(DepBLAS openblas64)
                if (NOT DepBLAS_FOUND)
                    pkg_check_modules(DepBLAS REQUIRED openblas)
                endif()
            elseif (${LLAMA_BLAS_VENDOR} MATCHES "FLAME")
                pkg_check_modules(DepBLAS REQUIRED blis)
            elseif (${LLAMA_BLAS_VENDOR} MATCHES "ATLAS")
                pkg_check_modules(DepBLAS REQUIRED blas-atlas)
            elseif (${LLAMA_BLAS_VENDOR} MATCHES "FlexiBLAS")
                pkg_check_modules(DepBLAS REQUIRED flexiblas_api)
            elseif (${LLAMA_BLAS_VENDOR} MATCHES "Intel")
                # all Intel* libraries share the same include path
                pkg_check_modules(DepBLAS REQUIRED mkl-sdl)
            elseif (${LLAMA_BLAS_VENDOR} MATCHES "NVHPC")
                # this doesn't provide pkg-config
                # suggest to assign BLAS_INCLUDE_DIRS on your own
                if ("${NVHPC_VERSION}" STREQUAL "")
                    message(WARNING "Better to set NVHPC_VERSION")
                else()
                    set(DepBLAS_FOUND ON)
                    set(DepBLAS_INCLUDE_DIRS "/opt/nvidia/hpc_sdk/${CMAKE_SYSTEM_NAME}_${CMAKE_SYSTEM_PROCESSOR}/${NVHPC_VERSION}/math_libs/include")
                endif()
            endif()
            if (DepBLAS_FOUND)
                set(BLAS_INCLUDE_DIRS ${DepBLAS_INCLUDE_DIRS})
            else()
                message(WARNING "BLAS_INCLUDE_DIRS neither been provided nor been automatically"
                " detected by pkgconfig, trying to find cblas.h from possible paths...")
                find_path(BLAS_INCLUDE_DIRS
                    NAMES cblas.h
                    HINTS
                        /usr/include
                        /usr/local/include
                        /usr/include/openblas
                        /opt/homebrew/opt/openblas/include
                        /usr/local/opt/openblas/include
                        /usr/include/x86_64-linux-gnu/openblas/include
                    )
            endif()
        endif()

        message(STATUS "BLAS found, Includes: ${BLAS_INCLUDE_DIRS}")

        add_compile_options(${BLAS_LINKER_FLAGS})

        add_compile_definitions(GGML_USE_BLAS)

        if (${BLAS_INCLUDE_DIRS} MATCHES "mkl" AND (${LLAMA_BLAS_VENDOR} MATCHES "Generic" OR ${LLAMA_BLAS_VENDOR} MATCHES "Intel"))
            add_compile_definitions(GGML_BLAS_USE_MKL)
        endif()

        set(GGML_HEADERS_BLAS ggml-blas.h)
        set(GGML_SOURCES_BLAS ggml-blas.cpp)

        set(LLAMA_EXTRA_LIBS     ${LLAMA_EXTRA_LIBS}     ${BLAS_LIBRARIES})
        set(LLAMA_EXTRA_INCLUDES ${LLAMA_EXTRA_INCLUDES} ${BLAS_INCLUDE_DIRS})
    else()
        message(WARNING "BLAS not found, please refer to "
        "https://cmake.org/cmake/help/latest/module/FindBLAS.html#blas-lapack-vendors"
        " to set correct LLAMA_BLAS_VENDOR")

and replaced with overrides of the ggml options (new lines 70-72):

# override ggml options
set(GGML_CCACHE ${LLAMA_CCACHE})

set(GGML_BUILD_SHARED_LIBS ${LLAMA_BUILD_SHARED_LIBS})

slaren · 327 days ago (edited 327 days ago) · 👍 1

GGML_BUILD_SHARED_LIBS and LLAMA_BUILD_SHARED_LIBS do not exist, BUILD_SHARED_LIBS is used directly.
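For context, a minimal illustration of the standard switch (illustrative sources):

# add_library() without an explicit STATIC/SHARED keyword consults
# BUILD_SHARED_LIBS directly, so no project-prefixed variant is needed.
option(BUILD_SHARED_LIBS "build shared libraries" OFF)
add_library(ggml ggml.c)   # SHARED when BUILD_SHARED_LIBS=ON, STATIC otherwise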

mofosyne · 327 days ago · 👍 2

If you are reorganizing the source code, I would like to suggest moving some of the compiled tools from the examples folder into a dedicated folder. They are not quite scripts, but in my opinion they are not examples either, considering they are more for maintainers' internal usage.

ggerganov force-pushed from 7c1c378c to 622fc9b5 326 days ago
ggerganov force-pushed from 622fc9b5 to c23c6983 325 days ago
ggerganov force-pushed from c23c6983 to e6faa61e 325 days ago
ggerganov force-pushed from 0a977ffc to 0938c55b 325 days ago
ggerganov force-pushed from d011b9ff to 5b218911 324 days ago
ggerganov force-pushed from 5b218911 to bddb4931 324 days ago
ggerganov force-pushed from bddb4931 to 5e376582 324 days ago
ggerganov force-pushed from c57cbf5a to 1f859b8c 324 days ago
ggerganov force-pushed from caf50ea9 to b192bed3 324 days ago
ggerganov force-pushed from b192bed3 to 417de6b9 324 days ago
github-actions added server
ggerganov marked this pull request as ready for review 324 days ago
ggerganov · 324 days ago (edited 323 days ago) · 👍 1

The PRs are ready for review. The CI in llama.cpp is still failing for some of the workflows:

There are quite a few things to improve on top of this, but it is probably better to merge first and resolve them from master.

ggerganov requested a review from slaren 324 days ago
slaren · 323 days ago

This may be related to https://cmake.org/cmake/help/latest/policy/CMP0165.html#policy:CMP0165

It may also affect CUDA. I get these warnings:

CMake Warning (dev) in src/CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "llama".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in examples/gguf/CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "llama-gguf".
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Warning (dev) in examples/llava/CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "llava_shared".
This warning is for project developers.  Use -Wno-dev to suppress it.

I am not sure what is supposed to be the solution in this case. I think that the source of the problem is using multiple CMakeLists.txt for the same target. Maybe this could be fixed by using include instead of add_subdirectory.

slaren
slaren323 days ago (edited 323 days ago)

Probably the simplest solution would be to remove src/CMakeLists.txt entirely and merge it into the base CMakeLists.txt. I think that the expectation is that add_subdirectory is only used when a project has multiple targets.

Each backend could be moved to a different target. This will need to be done when making the backends dynamically loadable in the future.
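A rough sketch of what such a backend target could look like (hypothetical; the PR does not implement this):

# A dynamically loadable CUDA backend as its own target (names illustrative).
find_package(CUDAToolkit REQUIRED)
add_library(ggml-cuda MODULE ggml-cuda.cu)   # MODULE: loaded at runtime, never linked against
target_link_libraries(ggml-cuda PRIVATE ggml CUDA::cudart)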

ggerganov
ggerganov323 days ago

Probably the simplest solution would be to remove src/CMakeLists.txt entirely and merge it into the base CMakeLists.txt.

We still need to add_subdirectory(ggml) in the root llama.cpp/CMakeLists.txt - wouldn't it hit the same issue?

I've resolved these warnings in whisper.cpp by enabling CUDA in whisper.cpp/src/CMakeLists.txt: https://github.com/ggerganov/whisper.cpp/blob/f3c609bd2e69b40f7500af8e13c1db270f997403/src/CMakeLists.txt#L80-L101

We can do the same in llama.cpp/src/CMakeLists.txt, but for HIP it seems more complicated because it checks for some compiler compatibility
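Simplified, the referenced whisper.cpp approach amounts to enabling the CUDA language inside the subdirectory that defines the targets (a sketch; the real snippet linked above has more detail):

# In src/CMakeLists.txt: once CUDA is enabled here, targets defined in this
# directory get default CUDA_ARCHITECTURES and the CMP0104 warning goes away.
if (GGML_CUDA)
    enable_language(CUDA)
endif()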

slaren · 323 days ago

We still need to add_subdirectory(ggml) in the root llama.cpp/CMakeLists.txt - wouldn't it hit the same issue?

My expectation was that ggml would be built either as a static or a shared library, and then linked to llama.cpp. However, this does not seem to be happening, because the ggml target is created as an "object" library, which seems to mean that it does not behave as a library at all; it is more like adding the source files to the parent project. Changing this fixes the CUDA warnings for me.

index 4f146026..f7ecdc2a 100644
--- a/ggml/src/CMakeLists.txt
+++ b/ggml/src/CMakeLists.txt
@@ -1129,7 +1129,7 @@ endif()

 # ggml

-add_library(ggml OBJECT
+add_library(ggml
             ../include/ggml.h
             ../include/ggml-alloc.h
             ../include/ggml-backend.h

Is there any reason to use an OBJECT library?

ggerganov · 323 days ago

Ah, it wasn't intended to be an object library - I just didn't notice. Nice catch
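For context, a minimal illustration of the difference (the two add_library(ggml ...) calls below are alternative versions of the same file, not one script):

# OBJECT: 'ggml' is only a named set of object files; consumers must absorb
# them via $<TARGET_OBJECTS:...>, so there is no real library boundary.
add_library(ggml OBJECT ggml.c)
add_library(llama llama.cpp $<TARGET_OBJECTS:ggml>)

# Regular library: STATIC or SHARED depending on BUILD_SHARED_LIBS, linked
# like any other dependency.
add_library(ggml ggml.c)
add_library(llama llama.cpp)
target_link_libraries(llama PUBLIC ggml)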

slaren commented on 2024-06-25 on ggml/src/CMakeLists.txt (resolved):

-add_library(ggml_static STATIC $<TARGET_OBJECTS:ggml>)
-
 if (BUILD_SHARED_LIBS)
     set_target_properties(ggml PROPERTIES POSITION_INDEPENDENT_CODE ON)
-    add_library(ggml_shared SHARED $<TARGET_OBJECTS:ggml>)
-    target_link_libraries(ggml_shared PRIVATE Threads::Threads ${GGML_EXTRA_LIBS})
+target_link_libraries(ggml PRIVATE Threads::Threads ${GGML_EXTRA_LIBS})

slaren · 323 days ago

This is probably not necessary here, it is already done in line 1164.

ggerganov · 323 days ago

Force-pushed a fix and I also think the Kompute build should work now

ggerganov force-pushed from 31ad52be to 850f45ce 323 days ago
ggerganov force-pushed from 850f45ce to 85d6859f 323 days ago
slaren · 323 days ago · 👍 3 🎉 1

This may also fix ggml being linked multiple times when using shared libraries, which should reduce the size of the pre-built binaries significantly. Currently, the Windows CUDA binaries are almost 1 GB in total. Compare the file sizes of my build from this branch with the latest build from GitHub:

[image: side-by-side file listing comparing the binary sizes of the two builds]

slaren commented on 2024-06-25 on Makefile (resolved):

 MK_CXXFLAGS  = -std=c++11 -fPIC
 MK_NVCCFLAGS = -std=c++11

-# -Ofast tends to produce faster code, but may not be available for some compilers.
-ifdef LLAMA_FAST
-MK_CFLAGS += -Ofast
-HOST_CXXFLAGS += -Ofast
-ifndef LLAMA_DEBUG
-MK_NVCCFLAGS += -O3
-endif # LLAMA_DEBUG
-else
-MK_CFLAGS += -O3
-MK_CXXFLAGS += -O3
-ifndef LLAMA_DEBUG
-MK_NVCCFLAGS += -O3
-endif # LLAMA_DEBUG
-endif # LLAMA_FAST
slaren · 323 days ago · 👍 1

The removal of this is causing the make build to be built without optimizations.

slaren · 323 days ago · 👍 1

There are still some references to LLAMA_CUDA in the Makefile:

diff --git a/Makefile b/Makefile
index 88084cac..80e7678d 100644
--- a/Makefile
+++ b/Makefile
@@ -859,7 +859,7 @@ override NVCCFLAGS := $(MK_NVCCFLAGS) $(NVCCFLAGS)
 override LDFLAGS   := $(MK_LDFLAGS) $(LDFLAGS)

 # identify CUDA host compiler
-ifdef LLAMA_CUDA
+ifdef GGML_CUDA
 GF_CC := $(NVCC) $(NVCCFLAGS) 2>/dev/null .c -Xcompiler
 include scripts/get-flags.mk
 CUDA_CXXFLAGS := $(BASE_CXXFLAGS) $(GF_CXXFLAGS) -Wno-pedantic
@@ -884,7 +884,7 @@ $(info I NVCCFLAGS: $(NVCCFLAGS))
 $(info I LDFLAGS:   $(LDFLAGS))
 $(info I CC:        $(shell $(CC)   --version | head -n 1))
 $(info I CXX:       $(shell $(CXX)  --version | head -n 1))
-ifdef LLAMA_CUDA
+ifdef GGML_CUDA
 $(info I NVCC:      $(shell $(NVCC) --version | tail -n 1))
 CUDA_VERSION := $(shell $(NVCC) --version | grep -oP 'release (\K[0-9]+\.[0-9])')
 ifeq ($(shell awk -v "v=$(CUDA_VERSION)" 'BEGIN { print (v < 11.7) }'),1)
@@ -896,7 +896,7 @@ endif # CUDA_POWER_ARCH
 endif # CUDA_DOCKER_ARCH

 endif # eq ($(shell echo "$(CUDA_VERSION) < 11.7" | bc),1)
-endif # LLAMA_CUDA
+endif # GGML_CUDA
 $(info )

 ifdef DEPRECATE_WARNING
slaren commented on 2024-06-25 on docs/BLIS.md (resolved):

-make LLAMA_BLIS=1 -j
-# make LLAMA_BLIS=1 benchmark-matmult
+make GGML_BLIS=1 -j
+# make GGML_BLIS=1 benchmark-matmult

slaren · 323 days ago · 👍 1

Not from this change, but this is wrong regardless.

Suggested change:
-# make GGML_BLIS=1 benchmark-matmult
+# make GGML_BLIS=1 llama-benchmark-matmult
Conversation on ggml/src/CMakeLists.txt (resolved):

        list(APPEND GGML_CDEF_PRIVATE GGML_CUDA_FORCE_MMQ)
    endif()

    if (GGML_CUDA_FORCE_CUBLAS)

slaren · 323 days ago

I am not sure if GGML_CDEF_PRIVATE is really necessary, or maybe I just don't understand its purpose. I think that using add_compile_definitions would achieve the same thing. From what I can tell, both target_compile_definitions(PRIVATE) and add_compile_definitions modify the same underlying property, COMPILE_DEFINITIONS. For the public definitions it's different, since INTERFACE_COMPILE_DEFINITIONS needs to be modified as well, so in that case I think that GGML_CDEF_PUBLIC makes sense.

There are also a lot of usages of add_compile_options in this file, so if this is really necessary, shouldn't all of these be replaced as well?

Otherwise this should be updated to follow the same pattern:

Suggested change:
-add_compile_definitions(GGML_CUDA_FORCE_CUBLAS)
+list(APPEND GGML_CDEF_PRIVATE GGML_CUDA_FORCE_CUBLAS)
ggerganov · 323 days ago

Restored add_compile_definitions for the private flags
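For reference, the two forms discussed in this thread (a sketch): for PRIVATE definitions they end up in the same COMPILE_DEFINITIONS property, just at directory vs. target scope:

# Directory-scoped: applies to every target created below this point.
add_compile_definitions(GGML_CUDA_FORCE_CUBLAS)

# Collected-list pattern: same effect, but only for the ggml target.
list(APPEND GGML_CDEF_PRIVATE GGML_CUDA_FORCE_CUBLAS)
target_compile_definitions(ggml PRIVATE ${GGML_CDEF_PRIVATE})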

ggerganov force-pushed from 5a1660ac to 3c1532e0 323 days ago
ggerganov: scripts : update sync [no ci] (ef74eb13)
ggerganov: files : relocate [no ci] (98029722)
ggerganov: ci : disable kompute build [no ci] (62795ad0)
ggerganov: cmake : fixes [no ci] (888d790d)
ggerganov: server : fix mingw build (070031df)
ggerganov: cmake : minor [no ci] (39783930)
ggerganov: cmake : link math library [no ci] (c7c18f35)
ggerganov: cmake : build normal ggml library (not object library) [no ci] (723739f0)
ggerganov: cmake : fix kompute build (63b75fc4)
ggerganov: make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE (0d93f027)
slaren: move public backend headers to the public include directory (#8122) (22495bfe)
ggerganov force-pushed from 7d09bfff to 22495bfe 323 days ago
github-actions added testing
ggerganov: scripts : fix sync paths [no ci] (fd1d48a2)
ggerganov force-pushed from 9f4129a1 to fd1d48a2 323 days ago
ggerganov: scripts : sync ggml-blas.h [no ci] (c11dc915)
slaren approved these changes on 2024-06-26
ggerganov merged f3f65429 into master 322 days ago
ggerganov deleted the gg/reorganize-project branch 322 days ago
OuadiElfarouki · 322 days ago · 👀 2

@slaren The LLAMA_CUDA_FORCE_CUBLAS CMake option got mistakenly removed but is still used. I believe it's intended to be mutually exclusive with GGML_CUDA_FORCE_MMQ, so some changes might be needed (cmake, mmq.cu, ggml-cuda.cu, ...).
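One possible guard for that mutual exclusivity (a hypothetical sketch, not taken from the PR):

if (GGML_CUDA_FORCE_MMQ AND GGML_CUDA_FORCE_CUBLAS)
    message(FATAL_ERROR "GGML_CUDA_FORCE_MMQ and GGML_CUDA_FORCE_CUBLAS are mutually exclusive")
endif()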

ggerganov · 322 days ago (edited 322 days ago) · 👍 2

There is now GGML_CUDA_FORCE_CUBLAS

Edit: nvm #8140
