Commit 3a9a21ca authored by Anitha Raj, committed by Jakub Sujak

Optimize F32 <- QAI8DXP (LHS) x QSI4CXP (RHS) for SME



* GEMM and GEMV micro-kernels that multiply a dynamically quantized 8-bit integer (QAI8DX) LHS matrix by a quantized 4-bit integer (QSI4CX) RHS matrix and accumulate the result into a single-precision (F32) output, optimized for SME2.
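
For orientation, the arithmetic these micro-kernels perform can be sketched as a scalar reference in C. This is only an illustrative sketch, not the SME2 implementation from this commit. The function name `ref_matmul_f32_qai8dx_qsi4cx`, the unpacked row-major layouts, and the quantization convention (LHS real value = scale * (q - zero_point) per row, RHS real value = scale * q per output channel) are assumptions made for the example. The actual kernels operate on packed buffers (the "P" suffix in the title) whose layout is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical scalar reference: F32 <- int8 (asymmetric, per-row) x int4 (symmetric, per-channel).
 *
 * Assumed layouts (illustration only, not the packed formats used by the commit):
 *   lhs_q          : m x k int8 values, row-major
 *   lhs_scale      : m floats, one scale per LHS row
 *   lhs_zero_point : m int32 values, one zero-point per LHS row
 *   rhs_q          : n x k int4 values, each stored unpacked in an int8 in [-8, 7]
 *   rhs_scale      : n floats, one scale per output channel
 *   dst            : m x n floats, row-major
 */
void ref_matmul_f32_qai8dx_qsi4cx(size_t m, size_t n, size_t k,
                                  const int8_t *lhs_q,
                                  const float *lhs_scale,
                                  const int32_t *lhs_zero_point,
                                  const int8_t *rhs_q,
                                  const float *rhs_scale,
                                  float *dst)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            /* Accumulate in 32-bit integers over the shared dimension k. */
            int32_t acc = 0;
            for (size_t p = 0; p < k; ++p) {
                const int32_t a = (int32_t)lhs_q[i * k + p] - lhs_zero_point[i];
                const int32_t b = (int32_t)rhs_q[j * k + p];
                acc += a * b;
            }
            /* Dequantize once per output element with both scales. */
            dst[i * n + j] = (float)acc * lhs_scale[i] * rhs_scale[j];
        }
    }
}
```

With `m == 1` the same loop nest covers the GEMV case. An optimized kernel would compute the same values from packed operands rather than from these unpacked arrays.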

Signed-off-by: Mohamad Najem <mohamad.najem@arm.com>
Signed-off-by: Anitha Raj <anitha.raj@arm.com>
Signed-off-by: Michael Kozlov <michael.kozlov@arm.com>
Signed-off-by: Thomas Bamelis <thomas.bamelis@arm.com>
Reviewed-by: Anitha Raj <anitha.raj@arm.com>
Reviewed-by: Anton Bondarenko <anton.bondarenko@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Approved-by: Jakub Sujak <jakub.sujak@arm.com>
parent 1fbcf6b4
Pipeline #18449 passed with stages in 3 minutes and 11 seconds