Add bf16 Lhs/Rhs packed gemm kernels and packing functions (9783a5a7) · Commits · Kleidi / KleidiAI

Commit 9783a5a7, authored Oct 21, 2024 by Gunes Bayir; committed Oct 21, 2024 by Felix Johnny Thomasmathibalan

Add bf16 Lhs/Rhs packed gemm kernels and packing functions



This commit:
  - Adds a bf16 x bf16 = fp32 matmul microkernel with an 8x12 output block size
  - Adds Lhs/Rhs packing functions that pack the inputs and convert them from fp32 to bf16
  - Adds corresponding tests, with modifications to the testing framework and the reference implementation
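
For readers unfamiliar with the data flow described above (fp32 inputs converted to bf16, products accumulated in fp32), here is a minimal reference sketch in C. It is not the code from this commit: the names `f32_to_bf16`, `bf16_to_f32` and `ref_matmul_bf16_f32` are hypothetical, the row-major layout is an assumption, and round-to-nearest-even is assumed for the conversion. It does not model the packed layout or the 8x12 blocking of the actual microkernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert fp32 to bf16 (the top 16 bits of the fp32
 * encoding), assuming round-to-nearest-even. */
uint16_t f32_to_bf16(float value) {
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u) {
        /* NaN: keep the sign and set a mantissa bit so it stays a NaN. */
        return (uint16_t)((bits >> 16) | 0x0040u);
    }
    /* Adding 0x7FFF plus the lowest kept bit rounds to nearest, ties to even. */
    bits += 0x7FFFu + ((bits >> 16) & 1u);
    return (uint16_t)(bits >> 16);
}

/* Hypothetical helper: widen bf16 back to fp32 (exact, no rounding). */
float bf16_to_f32(uint16_t value) {
    uint32_t bits = (uint32_t)value << 16;
    float out;
    memcpy(&out, &bits, sizeof out);
    return out;
}

/* Reference matmul sketch: dst[m x n] = lhs[m x k] * rhs[k x n].
 * All matrices are assumed row-major fp32. Each input element is rounded
 * to bf16 before the multiply; the accumulator stays in fp32. */
void ref_matmul_bf16_f32(size_t m, size_t n, size_t k,
                         const float *lhs, const float *rhs, float *dst) {
    assert(lhs != NULL && rhs != NULL && dst != NULL);
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; ++p) {
                const float a = bf16_to_f32(f32_to_bf16(lhs[i * k + p]));
                const float b = bf16_to_f32(f32_to_bf16(rhs[p * n + j]));
                acc += a * b;
            }
            dst[i * n + j] = acc;
        }
    }
}
```

A reference of this shape is useful for testing because it rounds the inputs at the same point as a kernel that consumes bf16 operands, so any remaining difference comes from accumulation order rather than from input precision.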

Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>

Reviewed-by: Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
Reviewed-by: Anton Bondarenko <anton.bondarenko@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Approved-by: Gian Marco Iodice <gianmarco.iodice@arm.com>

parent d590d826

Pipeline #15149 passed with stages in 4 minutes and 24 seconds
