Optimize kai_rhs_pack_nxk_qsi4c32p_qsu4c32s1s0 using Advanced SIMD

Evie Wright requested to merge int4_perblock_vectorize into main

Optimize the transposed RHS packing function used by matmul_clamp_f32_qai8dxp_qsi4c32p with Advanced SIMD (Neon) instructions, for the case kr / sr = 8.

Signed-off-by: Evie Wright <evie.wright@arm.com>

Signed-off-by: Anitha Raj <anitha.raj@arm.com>

Edited by Anitha Raj
