Fix segmentation faults in benchmark tool

Jakub Sujak requested to merge jakub/benchmark_bugs into main
  • Fix incorrect calculation of LHS matrix stride value

For kernels that take the LHS matrix stride in their API, namely kai_matmul_clamp_f32_f32_f32p8x1biasf32_6x8x4_neon_mla and kai_matmul_clamp_f16_f16_f16p16x1biasf16_6x16x8_neon_mla, the LHS stride value was calculated incorrectly: it was computed in bits instead of bytes.
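To illustrate the unit mix-up, here is a minimal sketch, assuming a dense row-major LHS matrix with `k` columns. The helper names `lhs_stride_bits` and `lhs_stride_bytes` are hypothetical and are not taken from the benchmark tool.

```cpp
#include <cstddef>

// Hypothetical helper: the row stride expressed in bits.
// Passing this value to a kernel that expects bytes overstates the stride 8x.
constexpr std::size_t lhs_stride_bits(std::size_t k, std::size_t element_bits) {
    return k * element_bits;
}

// Hypothetical helper: the row stride expressed in bytes, rounded up so that
// a row never occupies a fractional byte.
constexpr std::size_t lhs_stride_bytes(std::size_t k, std::size_t element_bits) {
    return (k * element_bits + 7) / 8;
}

// For f32 (32-bit elements) and k = 64, one row spans 256 bytes, not 2048.
static_assert(lhs_stride_bytes(64, 32) == 256, "f32 row stride in bytes");
static_assert(lhs_stride_bits(64, 32) == 2048, "same row stride in bits");
```

A kernel that steps through LHS rows with the bit-based value would read well past the end of the allocated buffer, which is the kind of access address sanitizer reports.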

  • Fix insufficient allocation of memory for SME kernels

For SME kernels, such as kai_matmul_clamp_f32_f32_f32p16vlx1b_1x16vl_sme2_mla, the tensor sizes are expressed in terms of the streaming SVE vector length (VL). Thus, when running SME kernels, the LHS/RHS/DST buffer sizes must be scaled by the VL accordingly.
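The scaling can be sketched roughly as follows, assuming a buffer size is first computed per unit of vector length and then multiplied by the VL. The function `scale_by_vl` and its parameters are hypothetical. How the VL is queried at runtime and which dimensions are scaled are left to the actual change.

```cpp
#include <cstddef>

// Hypothetical helper: scale a size that is expressed per unit of vector
// length into an absolute size in bytes.
// `bytes_per_vl_unit` is the size computed as if VL were 1.
// `vl` is the streaming vector length, in the same unit the tensor shapes use.
constexpr std::size_t scale_by_vl(std::size_t bytes_per_vl_unit, std::size_t vl) {
    return bytes_per_vl_unit * vl;
}

// A buffer sized at 1024 bytes per VL unit needs 16384 bytes when VL is 16.
static_assert(scale_by_vl(1024, 16) == 16384, "scaled buffer size");
```

Allocating only the unscaled size leaves the buffer too small by a factor of VL, so the kernel writes beyond the allocation.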

The segmentation faults were discovered when running with address sanitizer enabled.

Signed-off-by: Jakub Sujak <jakub.sujak@arm.com>
