GaussianBlur CustomSigma: calculate faster in 16 bits
Gaussian Blur 15x15 kernels now always use the custom sigma variant, which performs much better with nearly the same accuracy. The Custom Sigma kernels in the SVE variant are unified and simplified using std::reference_wrapper.
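
For illustration only, a minimal scalar sketch of how a 15-tap custom sigma kernel could be evaluated with a 16-bit accumulator. It is not the code in this change: the helper names make_kernel_q8 and blur_pixel, the 8-bit source pixels, and the choice of weights that sum to 256 are all assumptions made for the example. With that scaling the worst-case sum is 255 * 256 = 65280, which still fits in uint16_t.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

constexpr int kSize = 15;  // taps per row/column of the 15x15 separable kernel
constexpr int kHalf = 7;   // index of the centre tap

// Hypothetical helper: quantize a 15-tap Gaussian to integer weights that
// sum to exactly 256 (an assumed fixed-point scale, not taken from the patch).
std::array<uint16_t, kSize> make_kernel_q8(float sigma) {
  assert(sigma > 0.0f);
  std::array<double, kSize> w{};
  double sum = 0.0;
  for (int i = 0; i < kSize; ++i) {
    const double d = i - kHalf;
    w[i] = std::exp(-(d * d) / (2.0 * sigma * sigma));
    sum += w[i];
  }
  std::array<uint16_t, kSize> q{};
  int total = 0;
  for (int i = 0; i < kSize; ++i) {
    q[i] = static_cast<uint16_t>(std::lround(w[i] / sum * 256.0));
    total += q[i];
  }
  // Fold the rounding error into the centre tap so the weights sum to 256.
  q[kHalf] = static_cast<uint16_t>(q[kHalf] + (256 - total));
  return q;
}

// Hypothetical helper: one output pixel of a horizontal pass over 15 source
// pixels, accumulated entirely in 16 bits.
uint8_t blur_pixel(const uint8_t* src, const std::array<uint16_t, kSize>& k) {
  uint16_t acc = 0;
  for (std::size_t i = 0; i < k.size(); ++i) {
    acc = static_cast<uint16_t>(acc + k[i] * src[i]);  // max 255 * 256 = 65280
  }
  return static_cast<uint8_t>((acc + 128) >> 8);  // round, then drop the scale
}
```

The narrower accumulator is what would let a vector implementation process twice as many lanes per register as a 32-bit one, at the cost of the quantization error in the weights.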