Matmul int4 micro-kernels for QA8DX (LHS) x QS4CX (RHS) -> F32

- The LHS matrix is quantized (Q) Asymmetric (A) 8-bit (8) with per-row (DX) quantization parameters
- The RHS matrix is quantized (Q) Symmetric (S) 4-bit (4) with per-channel (CX) quantization parameters
- The destination is F32
- Implement matmul int4 micro-kernels with intrinsics using the dotprod and i8mm extensions
- Implement a micro-kernel to pack the RHS matrix
- Implement two micro-kernels to dynamically quantize and pack the LHS matrix
- Add README.md
- No tests are added in this PR; tests will be added in a separate PR

Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>
Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
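
For readers unfamiliar with the QA8DX and QS4CX naming, the arithmetic behind the two formats can be sketched in plain Python. This is an illustrative reference only, not the code in this change: the function names are hypothetical, the rounding and clamping choices are assumptions, and nothing here reflects the packed memory layout or the dotprod/i8mm intrinsics used by the actual micro-kernels.

```python
def quantize_lhs_row(row):
    """Asymmetric 8-bit, per-row: returns (int8 values, scale, zero point)."""
    # Force the range to include 0.0 so that zero is exactly representable.
    lo = min(min(row), 0.0)
    hi = max(max(row), 0.0)
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    # Choose the zero point so that `lo` maps to -128.
    zero_point = max(-128, min(127, int(round(-128 - lo / scale))))
    q = [max(-128, min(127, int(round(x / scale)) + zero_point)) for x in row]
    return q, scale, zero_point


def quantize_rhs_channel(col):
    """Symmetric 4-bit, per-channel: returns (int4 values in [-8, 7], scale)."""
    amax = max(abs(x) for x in col)
    scale = amax / 7.0 if amax > 0 else 1.0
    q = [max(-8, min(7, int(round(x / scale)))) for x in col]
    return q, scale


def matmul_qa8dx_qs4cx(lhs, rhs):
    """lhs: M x K floats, rhs: K x N floats -> M x N floats (F32 in the kernels)."""
    k, n = len(rhs), len(rhs[0])
    lhs_q = [quantize_lhs_row(r) for r in lhs]
    rhs_q = [quantize_rhs_channel([rhs[p][j] for p in range(k)]) for j in range(n)]
    out = [[0.0] * n for _ in lhs]
    for i, (qa, sa, za) in enumerate(lhs_q):
        for j, (qb, sb) in enumerate(rhs_q):
            # Integer accumulation, then a single rescale per output element.
            acc = sum((a - za) * b for a, b in zip(qa, qb))
            out[i][j] = acc * sa * sb
    return out
```

The LHS parameters are computed per row at run time (hence "dynamic"), while the RHS scales are per output channel and can be computed once when the weights are packed.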