This patch improves performance for asin and acos.
The previous code calculated asin and acos via atan. The new code calculates them directly using a kernel function for asin. The performance improvement is 100% for asin_u10, 20% for asin_u35, 90% for acos_u10, and 50% for acos_u35. This patch also includes changes to the trig functions that reduce register pressure and should theoretically improve performance, although I cannot see any performance increase in the benchmark results.