Matmul Micro-kernels F32 <- QAI8DXP(LHS) x QSI8CXP(RHS) optimized for SME (3d8217c2) · Commits · Kleidi / KleidiAI

Commit 3d8217c2 authored May 30, 2025 by Anitha Raj; committed by Felix Johnny Thomasmathibalan on May 30, 2025

* Micro-kernels (1xN) that multiply a dynamically quantized asymmetric 8-bit integer LHS matrix with per-channel quantization (QAI8DX) by a quantized symmetric 8-bit integer RHS matrix with per-channel quantization (QSI8CX) and accumulate the result into a single-precision (F32) output, optimized for SME2 technology.
* Micro-kernels (MxN) that multiply a dynamically quantized asymmetric 8-bit integer LHS matrix with per-channel quantization (QAI8DX) by a quantized symmetric 8-bit integer RHS matrix with per-channel quantization (QSI8CX) and accumulate the result into a single-precision (F32) output, optimized for SME2 technology.
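
For readers unfamiliar with the data-type shorthand, the arithmetic these micro-kernels perform can be sketched as a scalar reference in plain Python. This is only an illustration under stated assumptions, not the KleidiAI API: every function name below is hypothetical, the LHS is assumed to be quantized per row with a scale and zero point, the RHS per output column with a scale only, and neither the packed layouts (the trailing P in QAI8DXP and QSI8CXP) nor any SME2 instructions are modelled.

```python
def quantize_lhs_row(row):
    """Dynamic asymmetric int8 quantization of one LHS row (hypothetical helper).

    Returns (q, scale, zero_point) such that x ~= scale * (q - zero_point).
    """
    lo = min(min(row), 0.0)  # keep 0.0 inside the representable range
    hi = max(max(row), 0.0)
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    zero_point = max(-128, min(127, round(-128 - lo / scale)))
    q = [max(-128, min(127, round(x / scale) + zero_point)) for x in row]
    return q, scale, zero_point


def quantize_rhs_per_channel(rhs):
    """Symmetric int8 quantization of a K x N RHS, one scale per output column."""
    k, n = len(rhs), len(rhs[0])
    scales = []
    for j in range(n):
        max_abs = max(abs(rhs[i][j]) for i in range(k))
        scales.append(max_abs / 127.0 if max_abs > 0.0 else 1.0)
    q = [[max(-127, min(127, round(rhs[i][j] / scales[j]))) for j in range(n)]
         for i in range(k)]
    return q, scales


def matmul_f32_qai8dx_qsi8cx(lhs, rhs):
    """Reference M x N product: integer accumulation, then dequantize to float."""
    q_rhs, rhs_scales = quantize_rhs_per_channel(rhs)
    k, n = len(q_rhs), len(q_rhs[0])
    # Column sums let the LHS zero point be removed once per output element:
    # sum((qa - zp) * qb) == sum(qa * qb) - zp * sum(qb)
    col_sums = [sum(q_rhs[i][j] for i in range(k)) for j in range(n)]
    out = []
    for row in lhs:
        q_row, lhs_scale, zero_point = quantize_lhs_row(row)
        out_row = []
        for j in range(n):
            acc = sum(q_row[i] * q_rhs[i][j] for i in range(k))  # integer dot product
            acc -= zero_point * col_sums[j]                      # zero-point correction
            out_row.append(lhs_scale * rhs_scales[j] * acc)      # float output
        out.append(out_row)
    return out
```

Passing a single LHS row corresponds to the 1xN case and several rows to the MxN case. The result should match an ordinary floating-point matrix product up to quantization error.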

Signed-off-by: Anitha Raj <anitha.raj@arm.com>

Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
Reviewed-by: Anton Bondarenko <anton.bondarenko@arm.com>
Approved-by: Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>

parent bf64dadd

Pipeline #26736 passed with stages in 9 minutes and 24 seconds
