Fix segmentation faults in benchmark tool
* Fix incorrect calculation of LHS matrix stride value

  For kernels that take the LHS matrix stride in their API, namely
  `kai_matmul_clamp_f32_f32_f32p8x1biasf32_6x8x4_neon_mla` and
  `kai_matmul_clamp_f16_f16_f16p16x1biasf16_6x16x8_neon_mla`, the LHS
  stride value was computed in bits instead of bytes.

* Fix insufficient allocation of memory for SME kernels

  For SME kernels, such as
  `kai_matmul_clamp_f32_f32_f32p16vlx1b_1x16vl_sme2_mla`, the tensor
  sizes are expressed in terms of the streaming SVE vector length (VL).
  Thus, when running SME kernels, the LHS/RHS/DST buffer sizes must be
  scaled by the VL appropriately.

The segmentation faults were discovered when running with
AddressSanitizer enabled.

Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Emil Ohlsson <emil.ohlsson@arm.com>
Approved-by: Emil Ohlsson <emil.ohlsson@arm.com>
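
The two fixes described in the message can be illustrated with a minimal sketch. This is not the benchmark tool's actual code: the helper names `lhs_stride_bytes`, `buffer_bytes`, and `make_buffer` are hypothetical, and `vl_scale` is an assumed multiplier standing in for whatever factor the tool derives from the streaming vector length.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: row stride of a row-major LHS matrix with k columns.
// The stride is a byte count, so it is k times the element size in bytes.
// Multiplying by a further 8 would give bits and overstate the stride.
constexpr std::size_t lhs_stride_bytes(std::size_t k, std::size_t elem_bytes) {
    return k * elem_bytes;
}

// Hypothetical helper: byte size of a rows x cols buffer.
// vl_scale is an assumed multiplier derived from the streaming vector
// length for SME kernels; pass 1 for kernels that do not depend on it.
constexpr std::size_t buffer_bytes(std::size_t rows, std::size_t cols,
                                   std::size_t elem_bytes, std::size_t vl_scale) {
    return rows * cols * elem_bytes * vl_scale;
}

// Allocate a zero-initialised byte buffer of the scaled size.
inline std::vector<unsigned char> make_buffer(std::size_t rows, std::size_t cols,
                                              std::size_t elem_bytes,
                                              std::size_t vl_scale) {
    return std::vector<unsigned char>(buffer_bytes(rows, cols, elem_bytes, vl_scale));
}
```

For instance, with `float` elements and `k = 4`, the byte stride is 16, whereas a bit-based computation would yield 128 and walk the kernel far past the end of each row. Likewise, leaving `vl_scale` at 1 for an SME kernel would under-allocate the buffer, which is the kind of out-of-bounds access AddressSanitizer reports.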
Parent: aa5bd2bf
Pipeline #26019 passed with stages in 8 minutes and 56 seconds