From 1ef22da78479600bd08262a3e50d9de55b001eb6 Mon Sep 17 00:00:00 2001
From: Jens Elofsson
Date: Fri, 7 Mar 2025 11:07:05 +0100
Subject: [PATCH 1/3] Update the changelog with new changes.

Signed-off-by: Jens Elofsson
---
 CHANGELOG.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index b664057e..fbdada29 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -11,6 +11,7 @@ KleidiAI follows the [Semantic Versioning](https://semver.org/) specification fo
 ## Upcoming Release
 
 - Extend benchmark tool to support all matrix multiplication micro-kernels.
+- Add new assembly ukernel optimized with FEAT_I8MM for matrix multiplication with 4x8 block size.
 
 ## v1.4.0
 
-- 
GitLab

From 1bdd42fe916a3cf9f9a48342cceda9800e86ae52 Mon Sep 17 00:00:00 2001
From: Jens Elofsson
Date: Fri, 7 Mar 2025 13:38:41 +0100
Subject: [PATCH 2/3] Add additional changes.

Signed-off-by: Jens Elofsson
---
 CHANGELOG.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index fbdada29..047aa6ed 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -12,6 +12,8 @@ KleidiAI follows the [Semantic Versioning](https://semver.org/) specification fo
 
 - Extend benchmark tool to support all matrix multiplication micro-kernels.
 - Add new assembly ukernel optimized with FEAT_I8MM for matrix multiplication with 4x8 block size.
+- Fixes:
+  - Remove "-Weffc++" from build flags
 
 ## v1.4.0
 
-- 
GitLab

From e25740c9220d6fb8f6b1897bb9fdaab24d5897bc Mon Sep 17 00:00:00 2001
From: Jens Elofsson
Date: Mon, 10 Mar 2025 10:36:37 +0100
Subject: [PATCH 3/3] Address review comments

- Move entry from 1.4.0 to 1.5.0 that was added after 1.4.0 release was done.

Signed-off-by: Jens Elofsson
---
 CHANGELOG.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 047aa6ed..c662ed3c 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -11,7 +11,9 @@ KleidiAI follows the [Semantic Versioning](https://semver.org/) specification fo
 ## Upcoming Release
 
 - Extend benchmark tool to support all matrix multiplication micro-kernels.
-- Add new assembly ukernel optimized with FEAT_I8MM for matrix multiplication with 4x8 block size.
+- New Advanced SIMD micro-kernels:
+  - New 4x8 block size variant of matrix multiplication of QAI8DXP LHS and QSI4C32P RHS with F32 output.
+    - Optimizations for FEAT_I8MM.
 - Fixes:
   - Remove "-Weffc++" from build flags
 
@@ -22,8 +24,6 @@ KleidiAI follows the [Semantic Versioning](https://semver.org/) specification fo
     - Optimizations for FEAT_DotProd.
   - New 1x8 block size variant of matrix multiplication of QAI8DXP LHS and QSI4C32P RHS with F32 output.
     - Optimizations for FEAT_DotProd.
-  - New 4x8 block size variant of matrix multiplication of QAI8DXP LHS and QSI4C32P RHS with F32 output.
-    - Optimizations for FEAT_I8MM.
   - New 1x8 block size variant of matrix multiplication of QAI8DXP 1x8 LHS and QSI4C32P 8x8 RHS with F32 output.
     - Optimizations for FEAT_DotProd.
 - New SME2 micro-kernels:
-- 
GitLab