Add fp16 in/out bf16 Gemm kernel and relevant packing functions (cd34c1bd) · Commits · Kleidi / KleidiAI

Commit cd34c1bd authored Nov 29, 2024 by Gunes Bayir Committed by Emil Ohlsson Nov 29, 2024

Add fp16 in/out bf16 Gemm kernel and relevant packing functions



This commit

* Adds a bf16 x bf16 = fp16 matmul microkernel with an 8x12 output block size
* Adds Lhs/Rhs packing functions that pack the inputs and convert them from fp16 to bf16
* Adds corresponding tests, along with modifications to the testing framework and reference implementation
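
The numerics described above (fp16 inputs narrowed to bf16 before multiplication) can be sketched in portable C++ as follows. This is an illustrative reference only, not the code added by this commit: all function names are hypothetical, the round-to-nearest-even choice is an assumption, accumulation is assumed to be in fp32, and the final narrowing of the result to fp16 is omitted for brevity.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>
#include <vector>

// Hypothetical helper: decode IEEE 754 binary16 bits into a float.
// The conversion is exact because every fp16 value is representable in fp32.
float fp16_bits_to_float(std::uint16_t h) {
    const std::uint32_t sign = (h >> 15) & 0x1u;
    const std::uint32_t exp = (h >> 10) & 0x1Fu;
    const std::uint32_t frac = h & 0x3FFu;
    float value;
    if (exp == 0) {
        // Zero or subnormal: frac * 2^-24
        value = std::ldexp(static_cast<float>(frac), -24);
    } else if (exp == 31) {
        value = frac ? std::numeric_limits<float>::quiet_NaN()
                     : std::numeric_limits<float>::infinity();
    } else {
        // Normal: (1024 + frac) * 2^(exp - 25)
        value = std::ldexp(static_cast<float>(frac | 0x400u), static_cast<int>(exp) - 25);
    }
    return sign ? -value : value;
}

// Hypothetical helper: narrow a float to bf16 bits (assumed round-to-nearest-even).
std::uint16_t float_to_bf16_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u) {
        // NaN: keep the sign and force a quiet NaN payload
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
    }
    const std::uint32_t rounding = 0x7FFFu + ((bits >> 16) & 1u);
    return static_cast<std::uint16_t>((bits + rounding) >> 16);
}

// Hypothetical helper: widen bf16 bits back to float (exact).
float bf16_bits_to_float(std::uint16_t b) {
    const std::uint32_t bits = static_cast<std::uint32_t>(b) << 16;
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// Hypothetical reference matmul: lhs is m x k and rhs is k x n, both row-major
// fp16 bit patterns. Each input is rounded to bf16, products accumulate in fp32.
std::vector<float> matmul_fp16_via_bf16(const std::vector<std::uint16_t>& lhs,
                                        const std::vector<std::uint16_t>& rhs,
                                        std::size_t m, std::size_t n, std::size_t k) {
    std::vector<float> dst(m * n, 0.0f);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                const float a = bf16_bits_to_float(
                    float_to_bf16_bits(fp16_bits_to_float(lhs[i * k + p])));
                const float b = bf16_bits_to_float(
                    float_to_bf16_bits(fp16_bits_to_float(rhs[p * n + j])));
                acc += a * b;
            }
            dst[i * n + j] = acc;
        }
    }
    return dst;
}
```

Because bf16 keeps 7 explicit mantissa bits against fp16's 10, the conversion step can lose precision: in this sketch the fp16 value 1 + 2^-10 (bits `0x3C01`) rounds to bf16 1.0 (bits `0x3F80`).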

Signed-off-by: Gunes Bayir <gunes.bayir@arm.com>

Reviewed-by: Viet-Hoa Do <viet-hoa.do@arm.com>
Reviewed-by: Jakub Sujak <jakub.sujak@arm.com>
Reviewed-by: Gunes Bayir <gunes.bayir@arm.com>
Reviewed-by: Emil Ohlsson <emil.ohlsson@arm.com>
Approved-by: Emil Ohlsson <emil.ohlsson@arm.com>

parent c4a22b2e

Pipeline #16670 passed with stages in 7 minutes and 13 seconds
