Commits · 2e266e487ecbe3b1a51460b458027ddc40cd3384 · Kleidi / KleidiAI

May 23, 2024

Matmul int4 micro-kernels for QA8DX (LHS) x QS4CX (RHS) -> F32 · c6835819

Gian Marco Iodice authored May 23, 2024 and Felix Johnny Thomasmathibalan committed May 23, 2024



- The LHS matrix is quantized (Q) Asymmetric (A) 8-bit (8) with per-row (DX) quantization parameters
- The RHS matrix is quantized (Q) Symmetric (S) 4-bit (4) with per-channel (CX) quantization parameters
- The destination is F32
- Implement the int4 matmul micro-kernels with intrinsics, using the dotprod and i8mm extensions
- Implement a micro-kernel to pack the RHS matrix
- Implement two micro-kernels to dynamically quantize and pack the LHS matrix
- Add README.md
- No tests are added in this PR. Tests will be added in a separate PR
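
The quantization scheme listed above can be illustrated with a small scalar reference. This is only a sketch of the arithmetic, not the KleidiAI micro-kernels: it does no packing and uses no dotprod or i8mm intrinsics. All function names are hypothetical, and the int8 range [-128, 127], the int4 range [-8, 7], and the choice of `amax / 7` as the RHS scale are assumptions made for illustration.

```python
def quantize_lhs_row_asymmetric_int8(row):
    """Asymmetric 8-bit quantization of one LHS row (hypothetical helper).

    Returns (quantized values, scale, zero point), computed from the row's
    own min/max, i.e. dynamically and per row.
    """
    qmin, qmax = -128, 127
    # Include 0.0 in the range so that zero is exactly representable.
    rmin = min(min(row), 0.0)
    rmax = max(max(row), 0.0)
    scale = (rmax - rmin) / (qmax - qmin) if rmax > rmin else 1.0
    zero_point = int(round(qmin - rmin / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    q = [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in row]
    return q, scale, zero_point


def quantize_rhs_channel_symmetric_int4(channel):
    """Symmetric 4-bit quantization of one RHS output channel (hypothetical helper).

    Symmetric means the zero point is fixed at 0, so only a scale is returned.
    """
    amax = max(abs(v) for v in channel)
    scale = amax / 7.0 if amax > 0 else 1.0  # assumed mapping of amax to 7
    q = [max(-8, min(7, int(round(v / scale)))) for v in channel]
    return q, scale


def matmul_qa8dx_qs4cx_f32(lhs, rhs):
    """Reference matmul: lhs is M x K floats, rhs is K x N floats, result is M x N floats."""
    k, n = len(rhs), len(rhs[0])
    # Quantize each RHS column (output channel) once, ahead of time.
    rhs_q = [
        quantize_rhs_channel_symmetric_int4([rhs[kk][j] for kk in range(k)])
        for j in range(n)
    ]
    out = []
    for row in lhs:
        q_l, s_l, zp = quantize_lhs_row_asymmetric_int8(row)
        out_row = []
        for q_r, s_r in rhs_q:
            # Integer accumulation, then a single rescale to float.
            acc = sum((a - zp) * b for a, b in zip(q_l, q_r))
            out_row.append(acc * s_l * s_r)
        out.append(out_row)
    return out


lhs = [[1.0, -2.0, 0.5, 3.0]]
rhs = [[7.0, 3.5], [-3.0, -1.0], [0.0, 0.5], [1.0, 2.0]]
result = matmul_qa8dx_qs4cx_f32(lhs, rhs)
```

The result approximates the float product of `lhs` and `rhs`. The residual error comes from rounding the LHS to 8 bits; the RHS values in this example were chosen so that they happen to be exactly representable in 4 bits.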

Signed-off-by: Gian Marco Iodice <gianmarco.iodice@arm.com>

Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>


Fix copyright notice in README.md file · 9e4c2ffa

Viet-Hoa Do authored May 23, 2024



* Also remove trailing whitespace.

Signed-off-by: Viet-Hoa Do <viet-hoa.do@arm.com>

Approved-by: Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>


Apr 09, 2024

Add README.md · b8bbdb62

Felix Johnny Thomasmathibalan authored Apr 09, 2024 and Jakub Sujak committed Apr 09, 2024



The content is taken from a pending Pull Request

Signed-off-by: Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
Co-authored-by: Gian Marco Iodice <gianmarco.iodice@arm.com>

Approved-by: Jakub Sujak <jakub.sujak@arm.com>
