- Apr 09, 2025
-
-
Emil Ohlsson authored
Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Emil Ohlsson authored
Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
Emil Ohlsson authored
* Rename `matmul` in imatmul interface to `imatmul`
* Rename `zero` argument in LHS pack to `pad_ptr`
* Clarify `k_chunk_length` to mean "in bytes"
Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Apr 08, 2025
-
-
Emil Ohlsson authored
Extend the QAI8 test suite to iterate over three clamp rates: no clamping, clamping 10% of the output range, and clamping 50% of the output range. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
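One way to turn a clamp rate into clamp bounds is sketched below. It is a minimal illustration, not the test suite's actual code: the function name `clamp_bounds` is hypothetical, and it assumes the removed portion of the range is split evenly between the low and high ends.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Returns {min, max} clamp bounds that remove `clamp_rate` (0.0 to 1.0) of the
// observed output range. Assumption: the cut is split evenly between both ends.
inline std::pair<float, float> clamp_bounds(const std::vector<float>& reference, float clamp_rate) {
    assert(!reference.empty());
    const auto [lo_it, hi_it] = std::minmax_element(reference.begin(), reference.end());
    const float lo = *lo_it;
    const float hi = *hi_it;
    const float cut = (hi - lo) * clamp_rate / 2.0F;
    return {lo + cut, hi - cut};
}
```

Iterating over the rates `0.0F`, `0.1F`, and `0.5F` with such a helper would cover the three cases named in the commit message.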
-
Emil Ohlsson authored
The order of static initialization is not guaranteed, so the test listing can be initialized before the list of kernels. Lazily initializing the kernel lists on first use solves this. This patch applies the fix to `matmul_test.cpp`. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
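The lazy-initialization idea can be sketched with a function-local static. The names `KernelInfo`, `get_kernels`, `kernel_at`, and `kernel_count` are hypothetical and do not come from `matmul_test.cpp`. The sketch shows only the general C++ pattern, not the patch itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical kernel descriptor; the real test suite has its own types.
struct KernelInfo {
    std::string name;
};

// A function-local static is initialized the first time control reaches it
// (thread-safe since C++11), so the list always exists before it is read.
inline const std::vector<KernelInfo>& get_kernels() {
    static const std::vector<KernelInfo> kernels = {
        {"kernel_a"},
        {"kernel_b"},
    };
    return kernels;
}

// Accessor a test listing could call, even during static initialization.
inline const KernelInfo& kernel_at(std::size_t index) {
    assert(index < get_kernels().size());
    return get_kernels()[index];
}

// A namespace-scope object that depends on the list is now safe to initialize.
static const std::size_t kernel_count = get_kernels().size();
```

Because every reader goes through `get_kernels()`, the list is constructed on demand, however the linker orders the namespace-scope initializers.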
-
- Apr 07, 2025
-
-
Emil Ohlsson authored
For LHS shapes that have more than one row, set the first row of data to be padding. As this further increases the number of different test inputs, this change also extends the caching mechanism to use an unordered map to store the generated test data, keyed by a single object that encompasses all parameters used to generate the test data. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
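A cache of this shape might look roughly like the sketch below. All names (`TestDataId`, `TestDataIdHash`, `generate_test_data`, `get_test_data`) and the choice of fields are assumptions made for illustration. The real key object and generator in the test framework are not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <initializer_list>
#include <unordered_map>
#include <vector>

// Hypothetical bundle of every parameter that influences the generated data.
struct TestDataId {
    std::size_t m;
    std::size_t n;
    std::size_t k;
    bool pad_first_row;

    bool operator==(const TestDataId& other) const {
        return m == other.m && n == other.n && k == other.k && pad_first_row == other.pad_first_row;
    }
};

// Combines the hashes of all fields so the struct can key an unordered_map.
struct TestDataIdHash {
    std::size_t operator()(const TestDataId& id) const {
        std::size_t seed = 0;
        for (const std::size_t value : {id.m, id.n, id.k, static_cast<std::size_t>(id.pad_first_row)}) {
            seed ^= std::hash<std::size_t>{}(value) + 0x9e3779b9U + (seed << 6) + (seed >> 2);
        }
        return seed;
    }
};

using TestData = std::vector<std::uint8_t>;

// Placeholder generator; a real one would fill in random data and padding.
inline TestData generate_test_data(const TestDataId& id) {
    assert(id.m > 0 && id.k > 0);
    return TestData(id.m * id.k, 0);
}

// Returns cached data for `id`, generating it only on the first request.
inline const TestData& get_test_data(const TestDataId& id) {
    static std::unordered_map<TestDataId, TestData, TestDataIdHash> cache;
    auto it = cache.find(id);
    if (it == cache.end()) {
        it = cache.emplace(id, generate_test_data(id)).first;
    }
    return it->second;
}
```

Putting every generation parameter into the key means two tests share data only when all of their inputs match, so adding a new parameter such as the padding flag cannot cause stale cache hits.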
-
- Apr 04, 2025
-
-
Emil Ohlsson authored
Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Emil Ohlsson authored
As reference data is cached, it's possible to run a larger number of shapes as long as test data is shared. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Apr 03, 2025
-
-
Jens Elofsson authored
Remove designated initializers from matmul_clamp_f32_qai8dxp_qsi4cxp_test to comply with the C++17 standard. Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Approved-by:
Viet-Hoa Do <viet-hoa.do@arm.com>
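Designated initializers only became standard C++ in C++20, so C++17 code has to initialize aggregates another way. The sketch below shows two common replacements. The struct `MatMulShape` and both helper functions are hypothetical and are not taken from the affected test.

```cpp
#include <cstddef>

// Hypothetical parameter struct, for illustration only.
struct MatMulShape {
    std::size_t m{1};
    std::size_t n{1};
    std::size_t k{1};
};

// C++20 designated initializers (diagnosed by -Wpedantic when building as C++17):
//   MatMulShape shape{.m = 4, .n = 8, .k = 16};

// C++17 alternative 1: positional aggregate initialization, in declaration order.
constexpr MatMulShape make_shape_positional() {
    return MatMulShape{4, 8, 16};
}

// C++17 alternative 2: start from the default member values, then assign by name.
constexpr MatMulShape make_shape_by_name() {
    MatMulShape shape{};
    shape.m = 4;
    shape.n = 8;
    shape.k = 16;
    return shape;
}
```

The positional form is shorter but depends on member order. The by-name form keeps the readability that designated initializers offered, at the cost of a few extra lines.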
-
- Apr 02, 2025
-
-
Viet-Hoa Do authored
* FP16 and BF16 classes are implemented in assembly, so the rest of the test framework no longer needs to be compiled with FP16 and BF16 support. This allows the tests to run on systems with only the base architecture.
* Remove unnecessary feature guard in kernel header file. Users of our API should not need to compile their code with BF16 support.
Signed-off-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
Emil Ohlsson authored
This change adds an IMATMUL version of the QAI8 kernel, updates the unit tests to call into the new kernel, and adds an interface for it. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
Signed-off-by:
Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Emil Ohlsson authored
There are some TODOs left in the code. They should be addressed, but the change is useful as is. There is also an issue where the RHS packing doesn't seem to be working. This is worked around by always packing the entire RHS, not only the part that is needed for the output portion. This needs to be investigated before releasing to the wild. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
Emil Ohlsson authored
The kernel requires the indirection pointers to be laid out in a packed manner, which needs to be indicated by the input type name. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Emil Ohlsson authored
Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Signed-off-by:
Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Signed-off-by:
Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>
-
Emil Ohlsson authored
Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
- Apr 01, 2025
-
-
Jens Elofsson authored
Removes designated initializers from:
- matmul_test.cpp
- matmul_clamp_f16_bf16p_bf16p_test.cpp
- matmul_clamp_f32_bf16p_bf16p_test.cpp

The following changes are made to the test framework:
- Add a default value for data_type in the DataFormats constructor
- Initialize members of struct MatMulMethod
- Add '-Wpedantic' as a build flag to the affected unit tests
Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Reviewed-by:
Anton Bondarenko <anton.bondarenko@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Jakub Sujak authored
Although `hw.optional.AdvSIMD` is the replacement for `hw.optional.neon`, this parameter is not present in all versions of the OS. This may lead to the test suite crashing or tests being erroneously skipped. Instead, we check if the machine supports `hw.optional.arm64` and, if true, we can assume Advanced SIMD support is always present. Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Viet-Hoa Do <viet-hoa.do@arm.com>
-
- Mar 26, 2025
-
-
Jens Elofsson authored
Update all version indicators to 1.6.0. Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
Jens Elofsson authored
Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
Optimizes the RHS packing by vectorizing the XOR operation. This is done for segment lengths of 4 or 8 bytes. The unoptimized path is used for any other segment length. Signed-off-by:
Dan Johansson <dan.johansson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
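The dispatch on segment length can be illustrated with a portable sketch. This is not the library's implementation. The function name `xor_segment` is hypothetical, the mask value is left to the caller because the commit message does not state it, and a word-wide XOR in plain C++ stands in here for the SIMD instructions an optimized kernel would use.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// XORs every byte of a segment with `mask`. Segments of 4 or 8 bytes are handled
// with a single word-wide XOR; any other length falls back to a byte loop.
inline void xor_segment(std::uint8_t* data, std::size_t length, std::uint8_t mask) {
    if (length == 8) {
        std::uint64_t word = 0;
        std::memcpy(&word, data, sizeof(word));
        word ^= 0x0101010101010101ULL * mask;  // replicate the mask into all 8 bytes
        std::memcpy(data, &word, sizeof(word));
    } else if (length == 4) {
        std::uint32_t word = 0;
        std::memcpy(&word, data, sizeof(word));
        word ^= 0x01010101U * mask;  // replicate the mask into all 4 bytes
        std::memcpy(data, &word, sizeof(word));
    } else {
        // Unoptimized path for every other segment length.
        for (std::size_t i = 0; i < length; ++i) {
            data[i] ^= mask;
        }
    }
}
```

Because the mask is replicated into every byte of the word, the result does not depend on endianness, and `std::memcpy` avoids unaligned or aliasing-unsafe loads.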
-
- Mar 21, 2025
-
-
Viet-Hoa Do authored
* The flag was set incorrectly, which disabled the activation function in the GEMV assembly kernel.
* The test is updated to drive the clamping parameters properly.
  - The clamping parameters are set to reduce the dynamic range of the output by 20%.
Resolves: KLEIDIAI-545
Signed-off-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Mar 14, 2025
-
-
Jakub Sujak authored
* Alias the KleidiAI target for consistent linking to KleidiAI with `KleidiAI::kleidiai`.
* Add install instructions for the KleidiAI library target and its public headers.
* Export the KleidiAI target so that projects may use the imported target with `find_package()`.

Usage:

Install KleidiAI:
```
cmake -S . -B build
cmake --build build
cmake --install build
```
Once installed, KleidiAI can be imported and linked to using `find_package()`.
```
find_package(KleidiAI CONFIG REQUIRED)
target_link_libraries(my_framework PRIVATE KleidiAI::kleidiai)
```
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Viet-Hoa Do <viet-hoa.do@arm.com>
-
- Mar 12, 2025
-
-
Update all version indicators to 1.5.0. Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Fix for reading LHS scale values in `kai_matmul_clamp_f32_qsi8d32p1vlx4_qsi4c32p4vlx4_1vlx4vl_sme2_mopa`. Fix the out-of-bounds read while loading the scale values from the LHS packed matrix in `kai_matmul_clamp_f32_qsi8d32p1vlx4_qsi4c32p4vlx4_1vlx4vl_sme2_mopa` by updating the predicate. Resolves: KLEIDIAI-507 Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
- Mar 11, 2025
-
-
Build system robustness is improved in several ways:
* Mark the standard 'build' folder as ignored. This helps when doing different builds from the same folder
* Combine source files for assembler kernels in the same targets
* Add sorting for new kernel lists
* Relax the clean step in CI for faster builds
Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
Jens Elofsson authored
Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
- Mar 07, 2025
-
-
Anton Bondarenko authored
Analyzing a skipped test without a proper report message is hard. Providing more details helps with that. Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
Emil Ohlsson authored
A couple of cleanups were done while adding support for QAI8 GEMV. These have been moved out to this patch:
* Sort file lists in `CMakeLists.txt`
* Add additional test shapes
* Minor readability tweaks
Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
- Mar 05, 2025
-
-
Jens Elofsson authored
This flag has been removed from CMakeLists.txt, but was accidentally left in kai_defs.bzl. Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
- Feb 27, 2025
-
-
Jens Elofsson authored
Change the type of rhs_zero_point to uint8_t to match the data type in the kai_rhs_pack_qs4cxs1s0_param struct. Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
- Feb 26, 2025
-
-
Jens Elofsson authored
The argument to std::mt19937's constructor is uint32_t, but the supplied value (the variable "seed") was uint64_t. This has been changed to uint32_t. Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
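A small sketch of keeping the seed at 32 bits end to end, so that no implicit narrowing happens where the engine is constructed. The helper names `to_seed` and `make_rng` are hypothetical and are not part of the test framework.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <random>

// Converts a wide value to a 32-bit seed, asserting that nothing is lost.
inline std::uint32_t to_seed(std::uint64_t value) {
    assert(value <= std::numeric_limits<std::uint32_t>::max());
    return static_cast<std::uint32_t>(value);
}

// Taking the seed as std::uint32_t makes the expected width explicit to callers.
inline std::mt19937 make_rng(std::uint32_t seed) {
    return std::mt19937(seed);
}
```

Making the conversion explicit in one place means a value that does not fit in 32 bits is caught in debug builds instead of being silently truncated.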
-
- Feb 24, 2025
-
-
* Refactor the benchmark tool to create a generic abstraction that allows for running matrix multiplication micro-kernels with different interfaces.
* Extend benchmark support to all matrix multiplication micro-kernels in the library.
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
This flag is a stylistic option in GCC and does not add to security hardening. Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Feb 20, 2025
-
-
- Add new assembly ukernel optimized with FEAT_I8MM for matrix multiplication with 4x8 block size.
- Update build script.
- Add to unit test.
Signed-off-by:
Michael Kozlov <michael.kozlov@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Feb 18, 2025
-
-
Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
Jens Elofsson authored
Update all version indicators to 1.4.0. Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-