- Jan 30, 2025
-
-
Signed-off-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Jan 29, 2025
-
-
Update all version indications Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Jan 28, 2025
-
-
Signed-off-by:
Michael Kozlov <michael.kozlov@arm.com> Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Jan 27, 2025
-
-
In kai_matmul_clamp_f32_qai8dxp1vlx8_qsi4cxp4vlx8_1vlx4vl_sme2_mopa:
* Fix the offset calculation
* Fix pointer increments in the matmul
Add new shapes to unit tests, to test n > 64
Resolves: #KLEIDIAI-405, #COMPMID-7918
Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
Emil Ohlsson authored
Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Jan 23, 2025
-
-
Emil Ohlsson authored
Downstream projects might not build with the same toolchain, which might trigger warnings. With this patch, these warnings are reported as warnings instead of being treated as errors. Also, add `-Werror` to the build flags of our testing pipeline, so that we do not accidentally introduce new warnings. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Jan 22, 2025
-
-
Jakub Sujak authored
Mention recent fixes for compiler warnings produced by enabling the `-Wcast-qual -Wmissing-prototypes -Wstrict-prototypes -Woverlength-strings` compiler options. Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
-
- Jan 16, 2025
-
-
Emil Ohlsson authored
Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Previously, we were dividing m_idx/n_idx by m_step/n_step. However, since m_step/n_step can be different from mr/nr, we should divide by mr/nr instead. Signed-off-by:
Gian Marco Iodice <gianmarco.iodice@arm.com> Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Emil Ohlsson <emil.ohlsson@arm.com>
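The indexing fix described in this commit can be illustrated with a minimal sketch. It assumes a packed matrix stored as consecutive blocks of `mr` rows; the helper name `packed_offset` and the `block_stride` parameter are hypothetical and are not the library's API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not the library's API: byte offset of the packed
 * block that contains row m_idx. Each packed block holds mr rows and
 * occupies block_stride bytes, so the block index is m_idx / mr. */
static size_t packed_offset(size_t m_idx, size_t mr, size_t block_stride) {
    assert(mr != 0 && m_idx % mr == 0); /* m_idx must start a block */
    return (m_idx / mr) * block_stride;
}
```

With `mr = 4` and a caller that steps in multiples of 8 rows, row index 8 starts packed block 2. Dividing by the step (8) instead of `mr` would select block 1, which is the kind of mismatch the commit message describes.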
-
- Jan 10, 2025
-
-
Emil Ohlsson authored
Describe changes since 1.1.0 Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Dec 30, 2024
-
-
Jakub Sujak authored
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Dec 24, 2024
-
-
- Add unit test Signed-off-by:
Michael Kozlov <michael.kozlov@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
* GEMM and GEMV micro-kernels to compute the matrix multiplication of dynamically quantized 8-bit integer (QAI8DX) LHS matrix and quantized 4-bit integer (QSI4CX) RHS matrix and the accumulation of the result into a single-precision (F32) output, optimized for SME2 technology. Signed-off-by:
Mohamad Najem <mohamad.najem@arm.com> Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Signed-off-by:
Michael Kozlov <michael.kozlov@arm.com> Signed-off-by:
Thomas Bamelis <thomas.bamelis@arm.com> Reviewed-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Anton Bondarenko <anton.bondarenko@arm.com> Reviewed-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Dec 13, 2024
-
-
Emil Ohlsson authored
Change the reported release version and update the changelog since the last release. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
There are a few items missing in the changelog since the last release. This commit updates the list with recent changes. Signed-off-by:
Emil Ohlsson <emil.ohlsson@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Optimize the generic RHS packing NxK. The performance improvement is around 1.5x. Signed-off-by:
Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Emil Ohlsson <emil.ohlsson@arm.com> Reviewed-by:
alankelly <alankelly@google.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Dec 11, 2024
-
-
Jakub Sujak authored
The `__fp16` type is known to be troublesome for certain build environments. Instead use type `float` in the interface, and in the implementation use types `float16_t` (an alias for `__fp16`) for casting or `uint16_t` for pointer arithmetic. Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
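The `uint16_t` pointer arithmetic mentioned in this commit can be sketched as follows. This is an illustration under assumptions, not the actual implementation: the helper name `f16_element_ptr` and its parameters are hypothetical, and the matrix is assumed to be row-major with 2-byte half-precision elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pointer to element (row, col) of a row-major
 * half-precision matrix passed as an opaque buffer. Elements are addressed
 * as uint16_t, which has the same 2-byte size as an IEEE half, so no
 * __fp16 type appears in the signature or in the address arithmetic. */
static const void* f16_element_ptr(const void* base, size_t row, size_t col, size_t num_cols) {
    assert(base != NULL && col < num_cols);
    const uint16_t* p = (const uint16_t*)base;
    return p + row * num_cols + col;
}
```

Only the code that actually reads or writes a value would cast the returned pointer to `float16_t`, which keeps the troublesome type out of public headers.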
-
- Dec 02, 2024
-
-
- GEMM and GEMV Micro-kernels to compute the matrix multiplication of dynamically quantized symmetric signed 8-bit integer with per-block quantization (QSI8D32) LHS matrix and quantized symmetric 4-bit signed integer with per-block quantization (QSI4C32) RHS matrix and the accumulation of the result into a single-precision (F32) output, optimized for SME2 technology. Signed-off-by:
Gian Marco Iodice <gianmarco.iodice@arm.com> Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Reviewed-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Nov 29, 2024
-
-
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
- Nov 28, 2024
-
-
- Add GeMM-like micro-kernels
- Add GeMV-like micro-kernels
Signed-off-by:
Gian Marco Iodice <gianmarco.iodice@arm.com> Reviewed-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Nov 27, 2024
-
-
Signed-off-by:
Michael Kozlov <michael.kozlov@arm.com> Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
- Nov 04, 2024
-
-
Jakub Sujak authored
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Reviewed-by:
Anton Bondarenko <anton.bondarenko@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Oct 16, 2024
-
-
Jakub Sujak authored
Compute the Vector-Matrix multiply of F32 inputs to produce an F32 matrix, optimized using SME instructions. Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Oct 14, 2024
-
-
Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Anitha Raj <anitha.raj@arm.com> Approved-by:
Gian Marco Iodice <gianmarco.iodice@arm.com>
-
- Sep 27, 2024
-
-
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
Felix Johnny Thomasmathibalan authored
Signed-off-by:
Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Sep 25, 2024
-
-
This reverts commit f4f59599.
The existing FP32 GEMM micro-kernel (matmul_clamp_f32_f32_f32p8x1biasf32_6x8x4_neon_mla) has a dedicated path for M=1 (a "GEMV" operation). Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Sep 12, 2024
-
-
- The LHS matrix is Quantized (Q) Asymmetric (A) Signed 8-bit (I8) with per-row (DX) quantization parameters
- The RHS matrix is quantized (Q) Symmetric (S) Signed 4-bit (I4) with per-block quantization
- The destination is F32
- Implement micro-kernels to perform the matrix multiplication
- Implement a micro-kernel to pack the RHS matrix
Signed-off-by:
Gian Marco Iodice <gianmarco.iodice@arm.com> Signed-off-by:
Anitha Raj <anitha.raj@arm.com> Signed-off-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Anitha Raj <anitha.raj@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Reviewed-by:
Jakub Sujak <jakub.sujak@arm.com> Reviewed-by:
Max Ren <maxren@meta.com> Approved-by:
Viet-Hoa Do <viet-hoa.do@arm.com>
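As background for the per-row asymmetric 8-bit LHS format named in this commit, the following is a generic textbook sketch of per-row asymmetric quantization. It is not KleidiAI code and does not reproduce the library's packed layout; all names (`row_qparams`, `quantize_row_asymm_i8`) and the choice of rounding are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration type: quantization parameters of one row. */
typedef struct {
    float scale;        /* real value represented by one integer step */
    int32_t zero_point; /* integer that represents the real value 0.0 */
} row_qparams;

/* Round to nearest, ties away from zero, without needing <math.h>. */
static int32_t round_nearest(float v) {
    return (int32_t)(v >= 0.0f ? v + 0.5f : v - 0.5f);
}

static int32_t clamp_i32(int32_t v, int32_t lo, int32_t hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Quantize one row of k floats to int8 with its own scale and zero point. */
static row_qparams quantize_row_asymm_i8(const float* src, int8_t* dst, size_t k) {
    assert(src != NULL && dst != NULL);
    float min = 0.0f, max = 0.0f; /* the range always includes 0.0 */
    for (size_t i = 0; i < k; ++i) {
        if (src[i] < min) min = src[i];
        if (src[i] > max) max = src[i];
    }
    row_qparams qp;
    /* An all-zero row has no range; fall back to a scale of 1. */
    qp.scale = (max > min) ? (max - min) / 255.0f : 1.0f;
    qp.zero_point = clamp_i32(round_nearest(-128.0f - min / qp.scale), -128, 127);
    for (size_t i = 0; i < k; ++i) {
        const int32_t q = round_nearest(src[i] / qp.scale) + qp.zero_point;
        dst[i] = (int8_t)clamp_i32(q, -128, 127);
    }
    return qp;
}
```

Because the parameters are computed from the data of each row at run time, every row uses the full int8 range regardless of how the other rows are distributed.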
-
- Aug 30, 2024
-
-
Signed-off-by:
Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Aug 19, 2024
-
-
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-
- Aug 16, 2024
-
-
* The LHS matrix is Quantized (Q) Symmetric (S) Signed 8-bit (I8) with per-block (D32) quantization parameters
* The RHS matrix is Quantized (Q) Symmetric (S) Signed 4-bit (I4) with per-block (C32) quantization and F16 scale factors
* The destination is F32
* Implement micro-kernels to perform the matrix multiplication
* Implement a micro-kernel to pack the LHS and RHS matrices
* Add unit tests
Signed-off-by:
Gian Marco Iodice <gianmarco.iodice@arm.com> Signed-off-by:
Anitha Raj <Anitha.Raj@arm.com> Signed-off-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Viet-Hoa Do <viet-hoa.do@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Reviewed-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Jakub Sujak <jakub.sujak@arm.com>
-
- Jul 15, 2024
-
-
Felix Johnny Thomasmathibalan authored
Signed-off-by:
Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
- Jul 05, 2024
-
-
Jakub Sujak authored
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Approved-by:
Anton Bondarenko <anton.bondarenko@arm.com>
-
- Jul 04, 2024
-
-
Jakub Sujak authored
Signed-off-by:
Jakub Sujak <jakub.sujak@arm.com> Reviewed-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com> Approved-by:
Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
-