Add QAI8 IGEMM kernels

Emil Ohlsson requested to merge feature/int8-igemm into main

This change introduces three new kernels:

  • kai_imatmul_clamp_qai8_qai8p2vlx4_qsi8cxpsb2vlx4_2vlx2vl_sme2_mopa
  • kai_lhs_imatmul_pack_x8p2vlx4_x8p_sme
  • kai_rhs_imatmul_pack_kxn_qsi8cxp2vlx4sb_qs8cx_f32_i32_sme

These kernels implement indirect matmul (IGEMM). The main difference from the regular matmul kernels is that the LHS packing kernel takes an indirection buffer, in which each pointer refers to one chunk along the K dimension. The pointers are stored in a packed layout: rather than row-major order, each column of get_m_step chunk pointers is placed contiguously in the indirection buffer.
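To make the packed layout concrete, here is a minimal sketch of one way to read the description above. It is not code from this change: `pack_indirection`, `pad_ptr`, and the `int8_t` element type are hypothetical, and padding the final partial block of rows with `pad_ptr` is an assumption. Here `m_step` stands for the value returned by the kernel's get_m_step.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the library): rearranges a row-major
// table of chunk pointers, row_major[row * k_chunk_count + chunk], into
// blocks of m_step rows. Within a block, the m_step pointers belonging to
// one K chunk are stored contiguously, followed by the next chunk's pointers.
std::vector<const int8_t*> pack_indirection(
    const std::vector<const int8_t*>& row_major,
    size_t m, size_t k_chunk_count, size_t m_step,
    const int8_t* pad_ptr) {
    const size_t m_blocks = (m + m_step - 1) / m_step;  // round up
    // Assumption: slots for rows past m are filled with pad_ptr.
    std::vector<const int8_t*> packed(m_blocks * k_chunk_count * m_step, pad_ptr);
    for (size_t row = 0; row < m; ++row) {
        const size_t block = row / m_step;  // which block of m_step rows
        const size_t lane = row % m_step;   // position inside that block
        for (size_t chunk = 0; chunk < k_chunk_count; ++chunk) {
            packed[(block * k_chunk_count + chunk) * m_step + lane] =
                row_major[row * k_chunk_count + chunk];
        }
    }
    return packed;
}
```

Under these assumptions, with m = 3, k_chunk_count = 2 and m_step = 2, the first block holds the chunk-0 pointers of rows 0 and 1, then their chunk-1 pointers. The second block holds row 2's pointers, each followed by one padding slot.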

In addition to the kernels themselves, matmul_clamp_qai8_qai8p_qsi8cxp_test.cpp is extended to test the new kernels. The testing flow for these kernels is slightly different: the packing kernels are not tested directly; only the end-to-end flow is tested.

Signed-off-by: Emil Ohlsson <emil.ohlsson@arm.com>
Signed-off-by: Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>

Edited by Emil Ohlsson
