Draft: Matmul Micro-kernels F32<- (QSI8DX) LHS x (QAI4CX) RHS (!348) · Merge requests · Kleidi / KleidiAI

Anitha Raj requested to merge f32_qsi8dx_qai4cx into main Apr 10, 2025

Micro-kernels that multiply a dynamically quantized, symmetric, signed 8-bit integer LHS matrix with per-channel quantization (QSI8DX) by a quantized, asymmetric, signed 4-bit integer RHS matrix with per-channel quantization (QAI4CX), accumulating the result into a single-precision floating-point (F32) output:

Matrix multiplication (MxN) Micro-kernels of QSI8DX LHS and QAI4CX RHS with F32 output, optimized for FEAT_I8MM.
Matrix multiplication (1xN) Micro-kernels of QSI8DX LHS and QAI4CX RHS with F32 output, optimized for FEAT_DotProd.
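
For reviewers who want a scalar reference to compare against, the arithmetic these micro-kernels implement can be sketched as below. This is only an illustrative sketch, not the KleidiAI implementation: the function names, the unpacked list-of-lists layout, the one-scale-per-LHS-row granularity, and the dequantization convention `real = scale * (q - zero_point)` for the RHS are all assumptions made for the example. The actual kernels operate on packed buffers and use FEAT_I8MM or FEAT_DotProd instructions.

```python
def quantize_lhs_symmetric_int8(row):
    """Dynamically quantize one F32 row to signed 8-bit values.

    Symmetric scheme: the zero point is fixed at 0, so only a scale is
    returned. (Hypothetical helper; one scale per row is an assumption.)
    """
    max_abs = max((abs(v) for v in row), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0.0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in row]
    return q, scale


def matmul_f32_ref(lhs_q, lhs_scales, rhs_q, rhs_scales, rhs_zero_points):
    """Scalar reference: F32 <- symmetric int8 LHS x asymmetric int4 RHS.

    lhs_q:           M x K signed 8-bit values, one scale per row
    rhs_q:           N x K signed 4-bit values in [-8, 7], stored one row
                     per output channel (assumed layout)
    rhs_scales:      N per-channel scales
    rhs_zero_points: N per-channel zero points
    """
    out = []
    for m, lhs_row in enumerate(lhs_q):
        out_row = []
        for n, rhs_row in enumerate(rhs_q):
            acc = 0  # integer accumulator, as in the integer dot-product path
            for a, b in zip(lhs_row, rhs_row):
                acc += a * (b - rhs_zero_points[n])
            # Apply both scales once per output element to get the F32 result.
            out_row.append(acc * lhs_scales[m] * rhs_scales[n])
        out.append(out_row)
    return out
```

Calling `matmul_f32_ref` with M = 1 corresponds to the 1xN case, and with M > 1 to the MxN case; only the F32 results are meant to be comparable, not the performance characteristics.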

Signed-off-by: Anitha Raj <anitha.raj@arm.com>

