Add QAI8 IGEMM kernels

This change introduces three new kernels:
- kai_imatmul_clamp_qai8_qai8p2vlx4_qsi8cxpsb2vlx4_2vlx2vl_sme2_mopa
- kai_lhs_imatmul_pack_x8p2vlx4_x8p_sme
- kai_rhs_imatmul_pack_kxn_qsi8cxp2vlx4sb_qs8cx_f32_i32_sme

These kernels implement indirect matmul (IGEMM). The main difference
from the regular matmul kernels is that the LHS packing kernel takes an
indirection buffer in which each pointer refers to one chunk along the K
dimension. The pointers are stored in a packed layout: instead of
row-major order, each column of get_m_step chunk pointers is placed
contiguously in the indirection buffer.
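
As an illustration only, the packed layout described above could be
sketched as follows. This is one reading of the description, not the
kernels' actual code: the helper names, the k_chunk_count and m_step
parameters, and the use of a padding pointer for tail rows are all
assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the library API): position of the pointer
// for row m_idx and K chunk k_chunk_idx in the packed indirection buffer.
// Rows are grouped into blocks of m_step; within a block, the m_step pointers
// that share a K chunk are stored contiguously.
std::size_t packed_indirection_index(std::size_t m_idx, std::size_t k_chunk_idx,
                                     std::size_t k_chunk_count, std::size_t m_step) {
    assert(m_step > 0);
    const std::size_t block = m_idx / m_step;
    const std::size_t row_in_block = m_idx % m_step;
    return (block * k_chunk_count + k_chunk_idx) * m_step + row_in_block;
}

// Hypothetical helper: converts a row-major indirection buffer
// (row_major[m_idx * k_chunk_count + k_chunk_idx]) into the packed layout.
// Slots for rows beyond m in the last block are filled with pad_ptr, which is
// an assumption of this sketch.
std::vector<const void*> repack_indirection(const std::vector<const void*>& row_major,
                                            std::size_t m, std::size_t k_chunk_count,
                                            std::size_t m_step, const void* pad_ptr) {
    assert(m_step > 0);
    assert(row_major.size() == m * k_chunk_count);
    const std::size_t blocks = (m + m_step - 1) / m_step;
    std::vector<const void*> packed(blocks * k_chunk_count * m_step, pad_ptr);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t c = 0; c < k_chunk_count; ++c) {
            packed[packed_indirection_index(i, c, k_chunk_count, m_step)] =
                row_major[i * k_chunk_count + c];
        }
    }
    return packed;
}
```

In this sketch m_step stands in for the value the matmul kernel reports
through its get_m_step getter.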

In addition to the kernels themselves,
matmul_clamp_qai8_qai8p_qsi8cxp_test.cpp is extended to test the new
kernels. The testing flow for these kernels is slightly different: the
packing kernels are not tested directly; only the end-to-end flow is
tested.

Signed-off-by: Emil Ohlsson <emil.ohlsson@arm.com>
Signed-off-by: Felix Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
Signed-off-by: Mohammed Suhail Munshi <MohammedSuhail.Munshi@arm.com>