Commit b27875d9 authored by Anitha Raj, committed by Anton Bondarenko

Matmul Micro-kernels F32/F16 <- (QSI8D32) LHS x (QAI4C32) RHS



Micro-kernels that multiply a dynamically quantized, symmetric, signed 8-bit integer LHS matrix with per-block quantization (QSI8D32) by an asymmetric, signed 4-bit integer RHS matrix with per-block quantization (QAI4C32), accumulating the result into a single-precision (F32) or half-precision (F16) output:

- Matrix multiplication (MxN) Micro-kernels of QSI8D32 LHS and QAI4C32 RHS with F32 output, optimized for FEAT_I8MM.
- Matrix multiplication (1xN) Micro-kernels of QSI8D32 LHS and QAI4C32 RHS with F32 output, optimized for FEAT_DotProd.
- Matrix multiplication (MxN) Micro-kernels of QSI8D32 LHS and QAI4C32 RHS with F16 output, optimized for FEAT_I8MM.
- Matrix multiplication (1xN) Micro-kernels of QSI8D32 LHS and QAI4C32 RHS with F16 output, optimized for FEAT_DotProd.
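For readers unfamiliar with the data types named above, the following is a minimal scalar reference sketch of the arithmetic these micro-kernels presumably perform. It is not taken from the commit. The block length of 32 (read off the "32" suffix), the zero-point convention, the transposed RHS layout, and the helper names `quantize_sym_i8`, `quantize_asym_i4` and `matmul_ref` are all assumptions made for illustration. The real kernels operate on packed data and use FEAT_I8MM or FEAT_DotProd instructions.

```python
BLOCK = 32  # assumed block length, inferred from the "32" in QSI8D32 / QAI4C32


def quantize_sym_i8(block):
    """Symmetric signed 8-bit quantization of one block: x ~= scale * q."""
    amax = max(abs(v) for v in block)
    scale = amax / 127.0 if amax > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in block]
    return q, scale


def quantize_asym_i4(block):
    """Asymmetric signed 4-bit quantization of one block.

    Assumed convention: w ~= scale * (q - zero_point), with q in [-8, 7].
    The range is widened to include 0 so the zero point stays in [-8, 7].
    """
    lo = min(0.0, min(block))
    hi = max(0.0, max(block))
    scale = (hi - lo) / 15.0 if hi > lo else 1.0
    zero_point = round(-8 - lo / scale)
    q = [max(-8, min(7, round(v / scale) + zero_point)) for v in block]
    return q, scale, zero_point


def matmul_ref(lhs, rhs_t):
    """Reference matmul. lhs is M x K, rhs_t is N x K (RHS stored transposed).

    Each block is re-quantized on every use to keep the sketch short;
    an optimized kernel would quantize and pack once up front.
    """
    out = []
    for row in lhs:
        out_row = []
        for col in rhs_t:
            acc = 0.0
            for k0 in range(0, len(row), BLOCK):
                qa, sa = quantize_sym_i8(row[k0:k0 + BLOCK])
                qb, sb, zb = quantize_asym_i4(col[k0:k0 + BLOCK])
                # Integer dot product within the block, then one float rescale.
                idot = sum(a * (b - zb) for a, b in zip(qa, qb))
                acc += sa * sb * idot
            out_row.append(acc)
        out.append(out_row)
    return out
```

The sketch accumulates in Python floats. An F16-output variant would presumably differ only in narrowing the final value, and the 1xN variants correspond to calling `matmul_ref` with a single LHS row.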

Signed-off-by: Anitha Raj <anitha.raj@arm.com>

Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: Anitha Raj <anitha.raj@arm.com>
Reviewed-by: Anton Bondarenko <anton.bondarenko@arm.com>
Approved-by: Anton Bondarenko <anton.bondarenko@arm.com>
parent 22c47616
Pipeline #25376 passed with stages in 6 minutes and 39 seconds