Commit 885699df authored by Michael Kozlov, committed by Felix Johnny Thomasmathibalan

Optimize F32 <- QAI8DXP 1x8 (LHS) x QSI4C32P 8x8 (RHS) for 1x8 sdot



- Add a new assembly ukernel, optimized with FEAT_DOTPROD, for matrix multiplication with a 1x8 block size.
- Update the build script.
- Add the new ukernel to the unit tests.
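
For readers unfamiliar with the dot-product extension, the following is a minimal scalar sketch of the arithmetic an sdot-based 1x8 kernel accumulates. It is not the ukernel added by this commit. The function names, the nibble order, and the column-major RHS layout are hypothetical and chosen only for illustration; the real packed formats, the per-block scales, and the conversion to F32 are omitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of one 32-bit lane of SDOT:
 * the accumulator gains the sum of four signed 8-bit products. */
static int32_t sdot_lane(int32_t acc, const int8_t a[4], const int8_t b[4]) {
    for (int i = 0; i < 4; ++i) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}

/* Sign-extend a 4-bit two's-complement value held in the low nibble. */
static int8_t int4_to_int8(uint8_t nibble) {
    return (int8_t)(((nibble & 0x0F) ^ 0x08) - 8);
}

/* Reference 1x8 product: one int8 LHS row of length k against 8 int4 RHS columns.
 * Assumed (hypothetical) layout: each column occupies k/2 bytes, with the
 * even k index in the low nibble. k must be a multiple of 4. */
static void ref_matmul_1x8(size_t k, const int8_t *lhs, const uint8_t *rhs,
                           int32_t out[8]) {
    for (size_t n = 0; n < 8; ++n) {
        const uint8_t *col = rhs + n * (k / 2);
        int32_t acc = 0;
        for (size_t k0 = 0; k0 < k; k0 += 4) {
            int8_t b[4];
            for (size_t i = 0; i < 4; ++i) {
                uint8_t byte = col[(k0 + i) / 2];
                b[i] = int4_to_int8(((k0 + i) & 1) ? (uint8_t)(byte >> 4) : byte);
            }
            acc = sdot_lane(acc, lhs + k0, b);
        }
        out[n] = acc; /* integer accumulator only; scaling to float is not shown */
    }
}
```

A vectorized kernel would compute the eight accumulators in vector registers rather than in a scalar loop; a scalar reference of this kind is mainly useful for checking an optimized kernel's integer results in a unit test.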

Signed-off-by: Michael Kozlov <michael.kozlov@arm.com>

Reviewed-by: Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
Approved-by: Felix Johnny Thomasmathibalan <felixjohnny.thomasmathibalan@arm.com>
parent 8e3073c9
Pipeline #22938 passed with stages in 5 minutes and 15 seconds