- Dec 17, 2024
-
-
Signed-off-by:
Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
- Dec 13, 2024
-
-
Emil Ohlsson authored
Change the reported release version and update the changelog since the last release. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
A few items are missing from the changelog since the last release. This commit updates the list with recent changes. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Optimize the generic RHS packing NxK. The performance improvement is around 1.5x. Signed-off-by:
Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
Jakub Sujak authored
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Reviewed-by:
alankelly <alankelly@google.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
The original way of organizing matmul files was a bit complicated. A simpler layout reduces maintenance and makes it easier to add new microkernels. Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Dec 11, 2024
-
-
Jakub Sujak authored
The `__fp16` type is known to be troublesome for certain build environments. Instead use type `float` in the interface, and in the implementation use types `float16_t` (an alias for `__fp16`) for casting or `uint16_t` for pointer arithmetic. Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
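The pattern described in this commit can be sketched roughly as follows. This is a minimal illustration and not the library's code: the `demo_` names are hypothetical, and the half-precision cast is only compiled when `<arm_neon.h>` is available, since `float16_t` is an Arm-specific type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__ARM_NEON) && defined(__aarch64__)
#include <arm_neon.h> /* provides float16_t (an alias for __fp16) */
#define DEMO_HAVE_FLOAT16 1
#endif

/* Hypothetical helper: address of element (row, col) in a row-major matrix of
 * 16-bit values. uint16_t is used only to step in 2-byte units; the element
 * bits are never interpreted as a number here. */
static inline const uint16_t* demo_f16_element(
    const void* base, size_t row, size_t col, size_t row_stride_bytes) {
    assert(row_stride_bytes % sizeof(uint16_t) == 0);
    const uint8_t* row_ptr = (const uint8_t*)base + row * row_stride_bytes;
    return (const uint16_t*)row_ptr + col;
}

#if defined(DEMO_HAVE_FLOAT16)
/* Hypothetical helper: the signature takes `float`, and the half-precision
 * type appears only inside the implementation, as a cast. */
static inline void demo_fill_f16(void* dst, size_t count, float value) {
    const float16_t half_value = (float16_t)value;
    uint16_t* out = (uint16_t*)dst;
    for (size_t i = 0; i < count; ++i) {
        memcpy(out + i, &half_value, sizeof(half_value));
    }
}
#endif
```

Keeping `__fp16` out of the function signatures means callers never need a compiler that understands the type; only the translation unit that performs the cast does.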
-
Felix Johnny Thomasmathibalan authored
Indicate transition to following Semantic Versioning for future releases. Signed-off-by:
Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
- Dec 10, 2024
-
-
Jakub Sujak authored
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Anton Bondarenko authored
Version 8 of Bazel introduces an explicit requirement to select the WORKSPACE file as the source of external dependencies. Although that version is quite new, version 6.5 still needs to be supported, and there the new flag causes a build failure. Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
- Dec 09, 2024
-
-
Felix Johnny Thomasmathibalan authored
Signed-off-by:
Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
- Add more shapes
- Add pretty printer for test suite
Signed-off-by:
Michael Kozlov <michael.kozlov@arm.com> Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Anton Bondarenko authored
Using compiled headers in a mix of C and C++ code could lead to compilation errors where C-only flags are used. In this case headers should be interpreted together with the source files that use them. Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Viet-Hoa Do authored
* Only the following parts are included in the MSVC build:
  - All scalar kernels.
  - Half-precision floating-point reference implementation.
* Add tests.
Signed-off-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
- Dec 06, 2024
-
-
Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Dec 05, 2024
-
-
Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
The kernels do a lot of parameter checking, and one of the checks for the FP16 kernels is incorrect. This change fixes one error of this kind in the FP16 LHS packing kernel. The same issue is also corrected in the FP16 documentation. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Emil Ohlsson authored
Currently KAI_ASSERT() exits the program using `exit(EXIT_FAILURE)`, which is unhelpful when running under a debugger because it does not trap execution. This change calls `abort()` instead. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
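As a rough illustration of the difference, an assertion macro built on `abort()` might look like the sketch below. `DEMO_ASSERT` and `demo_checked_div` are hypothetical names; this is not the library's actual KAI_ASSERT definition.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical assertion macro. On failure it reports the location and calls
 * abort(), which raises SIGABRT so a debugger can stop at the failing frame.
 * exit(EXIT_FAILURE) would instead terminate through the normal exit path. */
#define DEMO_ASSERT(cond)                                                   \
    do {                                                                    \
        if (!(cond)) {                                                      \
            fprintf(stderr, "%s:%d: assertion failed: %s\n", __FILE__,      \
                    __LINE__, #cond);                                       \
            abort();                                                        \
        }                                                                   \
    } while (0)

/* Hypothetical usage: validate a parameter before using it. */
static inline int demo_checked_div(int num, int den) {
    DEMO_ASSERT(den != 0);
    return num / den;
}
```

With this shape, a failing check under a debugger leaves the call stack intact at the point of failure, so the offending argument can be inspected directly.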
-
- Dec 02, 2024
-
-
Signed-off-by:
Gian Marco Iodice <gianmarco.iodice@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- GEMM and GEMV micro-kernels to compute the matrix multiplication of a dynamically quantized symmetric signed 8-bit integer LHS matrix with per-block quantization (QSI8D32) and a quantized symmetric signed 4-bit integer RHS matrix with per-block quantization (QSI4C32), accumulating the result into a single-precision (F32) output, optimized for SME2 technology. Signed-off-by:
Gian Marco Iodice <gianmarco.iodice@arm.com> Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Reviewed-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
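The arithmetic behind such a kernel can be sketched with a scalar reference for a single output element. Everything about the data layout here is an assumption made for illustration: a block length of 32, one `float` scale per block on each side, and two's-complement 4-bit values packed two per byte with the low nibble first. The real packed formats and the `demo_` names are not taken from the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BLOCK_LEN 32 /* assumed quantization block length */

/* Decode a 4-bit two's-complement value (assumed encoding) to an int. */
static inline int demo_int4(uint8_t nibble) {
    return (nibble & 0x8) ? (int)nibble - 16 : (int)nibble;
}

/* Dot product of one int8 LHS row and one int4 RHS column, both symmetric
 * per-block quantized, accumulated in F32. Integer products are summed per
 * block, then scaled by the two block scales. */
static inline float demo_dot_qsi8_qsi4(
    const int8_t* lhs, const float* lhs_scales,
    const uint8_t* rhs_packed, const float* rhs_scales, size_t k) {
    assert(k % DEMO_BLOCK_LEN == 0);
    float acc = 0.0f;
    for (size_t b = 0; b < k / DEMO_BLOCK_LEN; ++b) {
        int32_t block_sum = 0;
        for (size_t i = 0; i < DEMO_BLOCK_LEN; ++i) {
            const size_t idx = b * DEMO_BLOCK_LEN + i;
            const uint8_t byte = rhs_packed[idx / 2];
            const uint8_t nibble =
                (idx % 2 == 0) ? (uint8_t)(byte & 0x0F) : (uint8_t)(byte >> 4);
            block_sum += (int32_t)lhs[idx] * demo_int4(nibble);
        }
        acc += (float)block_sum * lhs_scales[b] * rhs_scales[b];
    }
    return acc;
}
```

Because the quantization is symmetric, there is no zero-point term: each block contributes only its integer sum multiplied by the LHS and RHS scales.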
-
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Nov 29, 2024
-
-
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
Anton Bondarenko authored
Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Suhail M authored
* Add the SME Int8 GEMM set of microkernels:
  - LHS packing kernel.
  - Non-transposed RHS packing kernel.
  - Main kernel.
* Update the test framework to support static int8 GEMM.
Resolves: KLEIDIAI-171, KLEIDIAI-235, KLEIDIAI-39
Signed-off-by:
Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com> Signed-off-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Reviewed-by:
Mohammed Suhail Munshi <mohammedsuhail.munshi@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Anton Bondarenko authored
The current solution for supporting SME microkernels is to use SME opcodes and SVE instructions in streaming mode. A precondition for the compiler to understand SVE instructions is compilation with -march=...+sve+sve2. However, this allows the compiler to generate its own SVE instructions for normal C/C++ code, which might cause an illegal instruction exception on CPUs where SME is implemented without SVE. In this case we want to disable the use of compiler-generated SVE instructions. The absence of SVE instructions is tested using an FVP with SVE support disabled. Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Signed-off-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
RHS packing is required. LHS packing is not required. Signed-off-by:
Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Jakub Sujak authored
Regenerate the SME2 GEMV micro-kernel assembly so that it is contained within the SMSTART/SMSTOP boundary, preventing illegal instruction faults when attempting to execute streaming SVE code on a system without SVE support. Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
This commit:
* Adds a bf16 x bf16 = fp16 matmul microkernel with an 8x12 output block size.
* Adds LHS/RHS packing functions that pack and convert the inputs from fp16 to bf16.
* Adds corresponding tests, along with modifications to the testing framework and reference implementation.
Signed-off-by:
Gunes Bayir <gunes.bayir@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Jakub Sujak <jakub.sujak@arm.com> Reviewed-by:
Gunes Bayir <gunes.bayir@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
Felix Johnny Thomasmathibalan authored
Affected micro kernel: FP16 GEMM, SME2 Signed-off-by:
Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
- Nov 28, 2024
-
-
Jens Elofsson authored
Signed-off-by:
Jens Elofsson <jens.elofsson@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
- Add GeMM-like micro-kernels
- Add GeMV-like micro-kernels
Signed-off-by:
Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
This change fixes a minor copy-paste error in the kernel interfaces. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Nov 27, 2024
-
-
Signed-off-by:
Michael Kozlov <michael.kozlov@arm.com> Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
- Nov 21, 2024
-
-
Add fp16 kernels for LHS and RHS packing, and matmul. Also add related unit tests for these kernels, and extend the matmul unit tests to support calling fp16 kernels. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Nov 20, 2024
-
-
Emil Ohlsson authored
KleidiAI is intended to target certain build environments, which means that it should be buildable using CMake version 3.16. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Nov 19, 2024
-
-
* Round off the odd strides for the int4 RHS by padding with 0s Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
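One way to read this change: when a row holds an odd number of 4-bit values, its byte stride is rounded up and the unused half of the last byte is set to zero. The sketch below illustrates that reading only. The `demo_` helpers and the low-nibble-first layout are assumptions, not the library's packing routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bytes needed for `count` 4-bit values, rounded up to a whole byte. */
static inline size_t demo_int4_row_stride(size_t count) {
    return (count + 1) / 2;
}

/* Pack `count` 4-bit values (one per source byte, low nibble used) into dst,
 * two per byte with the low nibble first. For an odd count, the high nibble
 * of the final byte stays 0 because the row is zero-filled beforehand. */
static inline void demo_pack_int4_row(const uint8_t* src, size_t count, uint8_t* dst) {
    assert(src != NULL && dst != NULL);
    memset(dst, 0, demo_int4_row_stride(count));
    for (size_t i = 0; i < count; ++i) {
        const uint8_t nibble = (uint8_t)(src[i] & 0x0F);
        dst[i / 2] |= (i % 2 == 0) ? nibble : (uint8_t)(nibble << 4);
    }
}
```

For example, five values occupy three bytes under this scheme, and the top nibble of the third byte is padding.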
-