Improve packing performance for quantized Int4 per-block
Improves performance of 'kai_rhs_pack_nxk_qsi4c32pnrx8_qsu4c32s1s0_neon' by vectorizing row summation.

Signed-off-by: Evie Wright <evie.wright@arm.com>
Approved-by: Anton Bondarenko <anton.bondarenko@arm.com>
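
For readers unfamiliar with the technique, here is a minimal sketch of what vectorizing a row summation over packed 4-bit values can look like. It is not the code from this commit. The function names (row_sum_u4_scalar, row_sum_u4), the two-values-per-byte layout, and the zero point of 8 are assumptions made for illustration only. The NEON path is compiled only on AArch64; elsewhere the scalar loop handles the whole row.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Scalar reference (hypothetical helper): sum both 4-bit values of every
 * byte, subtracting an assumed zero point of 8 from each value. */
static int32_t row_sum_u4_scalar(const uint8_t *row, size_t num_bytes) {
    int32_t sum = 0;
    for (size_t i = 0; i < num_bytes; ++i) {
        sum += (int32_t)(row[i] & 0x0F) - 8; /* low nibble */
        sum += (int32_t)(row[i] >> 4) - 8;   /* high nibble */
    }
    return sum;
}

/* Vectorized variant (hypothetical helper): 16 bytes (32 values) per step
 * on AArch64, with a scalar loop for the remaining tail bytes. */
static int32_t row_sum_u4(const uint8_t *row, size_t num_bytes) {
    size_t i = 0;
    int32_t sum = 0;
#if defined(__aarch64__) && defined(__ARM_NEON)
    const uint8x16_t low_mask = vdupq_n_u8(0x0F);
    uint32x4_t acc = vdupq_n_u32(0);
    for (; i + 16 <= num_bytes; i += 16) {
        const uint8x16_t bytes = vld1q_u8(row + i);
        const uint8x16_t lo = vandq_u8(bytes, low_mask);
        const uint8x16_t hi = vshrq_n_u8(bytes, 4);
        /* lo + hi is at most 30 per lane, so the 8-bit add cannot overflow. */
        const uint16x8_t pairs = vpaddlq_u8(vaddq_u8(lo, hi));
        /* Widen again and accumulate into four 32-bit lanes. */
        acc = vpadalq_u16(acc, pairs);
    }
    /* Reduce the lanes, then remove the zero point once per value:
     * i bytes hold 2 * i values, each offset by 8, so subtract 16 * i. */
    sum = (int32_t)vaddvq_u32(acc) - (int32_t)(16 * i);
#endif
    return sum + row_sum_u4_scalar(row + i, num_bytes - i);
}
```

The zero-point correction is applied once after the loop rather than per element, which keeps the inner loop to a load, a mask, a shift, and two widening adds. Both functions should return the same value for any input, so the scalar version can serve as a test oracle for the vectorized one.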
Parent: d18f620a
Pipeline #27629 passed with stages in 8 minutes and 20 seconds