# Speech to Text library

* [Speech to Text library](#speech-to-text-library)
  * [Prerequisites](#prerequisites)
  * [Configuration options](#configuration-options)
  * [Quick start](#quick-start)
    * [Neural network](#neural-network)
    * [To Build for Android with KleidiAI kernels](#to-build-for-android-with-kleidiai-kernels)
    * [To Build for Android without KleidiAI Kernels](#to-build-for-android-without-kleidiai-kernels)
      * [To test the above build](#to-test-the-above-build)
    * [To Build for Linux (aarch64) with KleidiAI and SME kernels](#to-build-for-linux-aarch64-with-kleidiai-and-sme-kernels)
    * [To Build for Linux (aarch64) without KleidiAI Kernels](#to-build-for-linux-aarch64-without-kleidiai-kernels)
    * [To Build for native host](#to-build-for-native-host)
    * [To Build for macOS](#to-build-for-macos)
  * [Building and running tests](#building-and-running-tests)
    * [To Build a test executable](#to-build-a-test-executable)
    * [To Build a JNI lib](#to-build-a-jni-lib)
  * [Known Issues](#known-issues)
  * [Trademarks](#trademarks)
  * [License](#license)

This repo is designed for building an [Arm® KleidiAI™](https://www.arm.com/markets/artificial-intelligence/software/kleidi) enabled STT library using the CMake build system. It provides an abstraction over the [whisper.cpp](https://github.com/ggml-org/whisper.cpp) framework, with the Arm® KleidiAI™ backend available. In the future, we **may** add support for other frameworks and models.

The backend library (selected at the CMake configuration stage) is wrapped by this project's thin C++ layer, which can be used directly for testing and evaluation. JNI bindings are also provided for developers targeting Android™ based applications.

## Prerequisites

* CMake 3.27 or above installed
* Android™ NDK v27.1.12297006 (if building for Android™)
* Python 3.9 or above installed (used to download test resources and models)
* `NDK_PATH` set to point at the install location of the Android™ NDK
* aarch64 toolchain (tested with v11.2-2022.02)

## Configuration options

The project is designed to download the required software sources based on user-provided configuration options.

- `STT_DEP_NAME`: Can only be `whisper.cpp` in the current implementation. Other options may be added later.
- `WHISPER_SRC_DIR`: Path to the local source directory for the `whisper.cpp` dependency.
- `WHISPER_GIT_URL`: Git repository URL used to clone the `whisper.cpp` dependency.
- `WHISPER_GIT_TAG`: Specific git tag to use for the `whisper.cpp` dependency.

A configure-time usage sketch for these options is shown after the [Neural network](#neural-network) section below.

## Quick start

By default, the JNI builds are enabled, and Arm® KleidiAI™ kernels are enabled on arm64/aarch64. To disable the KleidiAI™ kernels, configure with `-DGGML_CPU_KLEIDIAI=OFF`.

### Neural network

This project uses the **ggml-base.en model** as its default network. The model is not quantized. To strike a balance between computational efficiency and model performance, you can use the **Q4_0 quantization format**. To quantize your model, you can use the [whisper.cpp quantize tool](https://github.com/ggml-org/whisper.cpp/tree/v1.7.4?tab=readme-ov-file#quantization).

- You can access the model from [Hugging Face](https://huggingface.co/ggerganov/whisper.cpp/blob/main/ggml-base.en.bin).
- The default model configuration is declared in the [`requirements.json`](scripts/py/requirements.json) file. However, any model supported by the backend library can be used.

> **NOTE**: Currently only Q4_0 models are accelerated by Arm® KleidiAI™ kernels in whisper.cpp.
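For reference, the following is a minimal sketch of producing a Q4_0 model with the whisper.cpp `quantize` tool. It assumes a local whisper.cpp checkout that has already been built with its own CMake setup; the binary path and file names are illustrative and may differ between whisper.cpp releases (see the quantization link above).

```shell
# Illustrative only: run from a whisper.cpp checkout after building it with CMake.
# Quantizes the default ggml-base.en model to Q4_0; the tool's path/name may
# vary between whisper.cpp versions.
./build/bin/quantize models/ggml-base.en.bin models/ggml-base.en-q4_0.bin q4_0
```

The resulting Q4_0 file can then be passed in place of the default model, for example via `whisper-cli -m`.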
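The dependency options listed under [Configuration options](#configuration-options) can be passed at configure time. The sketch below is illustrative only, assuming the `native-release-with-tests` preset described later in this README; the URL and tag values are examples, and the project's actual defaults come from its own CMake configuration.

```shell
# Illustrative only: fetch whisper.cpp from an explicit URL and tag
# instead of the project's defaults.
cmake -B build --preset=native-release-with-tests \
  -DSTT_DEP_NAME=whisper.cpp \
  -DWHISPER_GIT_URL=https://github.com/ggml-org/whisper.cpp.git \
  -DWHISPER_GIT_TAG=v1.7.4

# Alternatively, point the build at a local whisper.cpp checkout:
# cmake -B build --preset=native-release-with-tests -DWHISPER_SRC_DIR=/path/to/whisper.cpp

cmake --build ./build
```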
### To Build for Android with KleidiAI kernels

```shell
cmake -B build \
  -DCMAKE_TOOLCHAIN_FILE=${NDK_PATH}/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-33 \
  -DCMAKE_C_FLAGS=-march=armv8.2a+i8mm+dotprod+fp16 \
  -DCMAKE_CXX_FLAGS=-march=armv8.2a+i8mm+dotprod+fp16 \
  -DBUILD_SHARED_LIBS=false \
  -DTEST_DATA_DIR="/data/local/tmp" \
  -DTEST_MODELS_DIR="/data/local/tmp" \
  -DGGML_OPENMP=OFF

cmake --build ./build
```

### To Build for Android without KleidiAI Kernels

```shell
cmake -B build \
  -DCMAKE_TOOLCHAIN_FILE=${NDK_PATH}/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-33 \
  -DCMAKE_C_FLAGS=-march=armv8.2a+i8mm+dotprod+fp16 \
  -DCMAKE_CXX_FLAGS=-march=armv8.2a+i8mm+dotprod+fp16 \
  -DBUILD_SHARED_LIBS=true \
  -DGGML_CPU_KLEIDIAI=OFF \
  -DTEST_DATA_DIR="/data/local/tmp" \
  -DTEST_MODELS_DIR="/data/local/tmp" \
  -DGGML_OPENMP=OFF

cmake --build ./build
```

#### To test the above build

First, push the newly built test executable:

```shell
adb push build/bin/stt-cpp-tests /data/local/tmp
```

Next, push the newly built libs:

```shell
adb push build/lib/* /data/local/tmp
```

Then open a shell on your device:

```shell
adb shell
cd /data/local/tmp
export LD_LIBRARY_PATH=./
```

Finally, run the test executable:

```shell
./stt-cpp-tests
```

### To Build for Linux (aarch64) with KleidiAI and SME kernels

To build with SME kernels, ensure `GGML_CPU_ARM_ARCH` is set with the required feature flags, as shown below:

```shell
cmake -B build \
  --preset=elinux-aarch64-release-with-tests \
  -DGGML_CPU_ARM_ARCH=armv8.2-a+dotprod+i8mm+sve+sme+fp16 \
  -DGGML_CPU_KLEIDIAI=ON \
  -DBUILD_EXECUTABLE=ON

cmake --build ./build
```

Once built, the standalone application can be run to measure performance. Set `GGML_KLEIDIAI_SME=1` to enable the use of SME kernels during execution:

```shell
GGML_KLEIDIAI_SME=1 ./build/bin/whisper-cli -m resources_downloaded/models/model.bin /path/to/audio/audiofile.wav
```

To run without invoking SME kernels, set `GGML_KLEIDIAI_SME=0` during execution:

```shell
GGML_KLEIDIAI_SME=0 ./build/bin/whisper-cli -m resources_downloaded/models/model.bin /path/to/audio/audiofile.wav
```

### To Build for Linux (aarch64) without KleidiAI Kernels

```shell
cmake -B build \
  -DCMAKE_TOOLCHAIN_FILE=cmake/toolchains/aarch64.cmake \
  -DCMAKE_C_FLAGS=-march=armv8.2-a+dotprod+i8mm+fp16 \
  -DCMAKE_CXX_FLAGS=-march=armv8.2-a+dotprod+i8mm+fp16 \
  -DGGML_CPU_KLEIDIAI=OFF

cmake --build ./build
```

### To Build for native host

```shell
cmake -B build --preset=native-release-with-tests
cmake --build ./build
```

### To Build for macOS

To build for the CPU backend on macOS®, you can use the native CMake toolchain. However, additional flags are required to explicitly disable the default backends:

```shell
cmake -B build --preset=native-release-with-tests \
  -DCMAKE_C_FLAGS=-march=armv8.2-a+dotprod+i8mm+fp16 \
  -DCMAKE_CXX_FLAGS=-march=armv8.2-a+dotprod+i8mm+fp16 \
  -DGGML_METAL=OFF \
  -DGGML_BLAS=OFF

cmake --build ./build
```

## Building and running tests

To build and run the tests for the native host machine:

```shell
cmake -B build --preset=native-release-with-tests
cmake --build ./build
ctest --test-dir ./build
```

### To Build a test executable

Add the option:

```
-DBUILD_EXECUTABLE=true
```

to any of the build commands above.
For example:

```shell
cmake -B build \
  -DCMAKE_TOOLCHAIN_FILE=scripts/cmake/toolchains/aarch64.cmake \
  -DCMAKE_C_FLAGS=-march=armv8.2-a+dotprod+i8mm+fp16 \
  -DCMAKE_CXX_FLAGS=-march=armv8.2-a+dotprod+i8mm+fp16 \
  -DBUILD_EXECUTABLE=true

cmake --build ./build
```

This will produce an executable, which you can use to test your build, under:

```
build/bin/whisper-cli
```

You can run this executable against an audio file using the following:

```shell
./whisper-cli -m resources_downloaded/models/model.bin /path/to/audio/audiofile.wav
```

### To Build a JNI lib

Add the option:

```
-DBUILD_JNI_LIB=true
```

to any of the build commands above. To run the sample test `WhisperTestApp.java`, run the following command post-build:

```shell
ctest --test-dir ./build
```

## Known Issues

**Transcription speed** - Transcription is currently approximately 2x slower than expected; we are working on improving this.

## Trademarks

* Arm® and KleidiAI™ are registered trademarks or trademarks of Arm® Limited (or its subsidiaries) in the US and/or elsewhere.
* Android™ is a trademark of Google LLC.
* macOS® is a trademark of Apple Inc.

## License

This project is distributed under the software licenses in the [LICENSES](LICENSES) directory.