  1. Feb 12, 2019
    • [LIBM] Introduce faster method for evaluating polynomials (#239) · ca4fd109
      Naoki Shibata authored
      This patch replaces Horner's method, which was used to evaluate polynomials, with Estrin's scheme ( https://en.wikipedia.org/wiki/Estrin%27s_scheme ), which allows more computations to proceed in parallel under out-of-order execution.
      This patch also introduces a new reduction method for tan.
      With this patch, computation becomes faster mainly for double-precision functions, by a few percent up to about 20 percent. For example, the ratios of execution time before the patch to execution time after it are shown below for the following functions.
      
      Sleef_atan2d4_u35 : 1.21
      Sleef_powd4_u10 : 1.17
      Sleef_sind4_u35 : 1.10
      Sleef_tand4_u10 : 1.04
      Sleef_tand4_u35 : 1.17
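
The difference between the two evaluation orders can be sketched for a degree-7 polynomial as follows. This is a minimal scalar illustration, not the code from the patch: the function names `poly_horner` and `poly_estrin` and the coefficient layout (`c[i]` multiplies `x^i`) are assumptions made for this example. Horner's method forms one chain of seven dependent multiply-adds, while Estrin's scheme computes four independent pairs first and then combines them in two further levels.

```c
/* Horner's method: every step needs the result of the previous one,
   so the seven multiply-adds form a single serial dependency chain. */
double poly_horner(double x, const double c[8]) {
    double r = c[7];
    for (int i = 6; i >= 0; i--) {
        r = r * x + c[i];
    }
    return r;
}

/* Estrin's scheme for the same polynomial: the four pairs below do not
   depend on each other, so an out-of-order CPU can overlap them. */
double poly_estrin(double x, const double c[8]) {
    double x2 = x * x;
    double x4 = x2 * x2;

    /* Level 1: four independent linear terms. */
    double p01 = c[0] + c[1] * x;
    double p23 = c[2] + c[3] * x;
    double p45 = c[4] + c[5] * x;
    double p67 = c[6] + c[7] * x;

    /* Level 2: two independent combinations using x^2. */
    double q0 = p01 + p23 * x2;
    double q1 = p45 + p67 * x2;

    /* Level 3: final combination using x^4. */
    return q0 + q1 * x4;
}
```

Both functions compute the same polynomial in exact arithmetic. In floating point the results can differ slightly because the operations are rounded in a different order.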
  2. Jan 29, 2019
  3. Jan 24, 2019
    • Add quadprecision math library (#235) · a0537162
      Naoki Shibata authored
      This is part of the implementation of issue #233 ( https://github.com/shibatch/sleef/issues/233 ).
      At this point, add, mul, div and sqrt are implemented, together with their testers. The remaining functions will be committed in subsequent PRs.
      As for vector extensions, SSE2, AVX, FMA4, AVX2, AVX2_128, AVX512F, AdvSIMD and SVE are supported.
      
      This quad-precision math library is built only if the -DBUILD_QUAD option is given to cmake. For some time (1 year?), this sub-project will remain at the alpha stage of development.
  4. Jan 23, 2019
    • [CI] Fix configuration · 8e6e52f2
      Francesco Petrogalli authored
      1. `-march=armv8-a+simd` is removed as it is not necessary (#232)
      2. Delete output that is never generated (#231)
      
      It also changes the CI settings to remove GCC/OSX testing on Travis, because updating gcc with brew now takes too much time. Instead, the gcc build is now tested on Jenkins.
    • no message · 79df29e3
      Naoki Shibata authored
  5. Oct 23, 2018
  6. Oct 22, 2018
  7. Oct 15, 2018
  8. Oct 11, 2018
  9. Oct 08, 2018
    • Fix tester (#226) · a4fb670f
      Naoki Shibata authored
      I found a bug in the tester's handling of denormal and non-number values for functions with two arguments.
      This patch fixes that bug.
      There is no change in the library itself.
  10. Sep 10, 2018
  11. Sep 01, 2018
  12. Aug 31, 2018
  13. Aug 29, 2018
    • [Determinism] Add deterministic functions (#216) · 3998463e
      Naoki Shibata authored
      This patch adds implementations of deterministic functions.
      
      The SIMD source files (sleefsimd?p.c) are compiled twice for each vector extension, once with the DETERMINISTIC macro defined and once without it.
      The renaming done by rename*.h is switched according to the DETERMINISTIC macro.
      
      For example, when the DETERMINISTIC macro is undefined, the function name xsin is renamed to Sleef_sind2_u35sse2 by renamesse2.h.
      
      When the DETERMINISTIC macro is defined, the function name xsin is renamed to Sleef_cinz_sind2_u35sse2 instead.
      
      iuty* and tester2y* are added in order to test the newly added deterministic functions. As a consequence, the time needed for testing almost doubles.
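
The macro-based renaming described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the contents of the real rename*.h headers: the short name `xsquare` and the exported names `demo_squared2_sse2` and `demo_cinz_squared2_sse2` are hypothetical and were chosen only to mirror the naming pattern in the commit message.

```c
/* Hypothetical stand-in for a rename header such as renamesse2.h.
   The names here are illustrative, not actual SLEEF symbols. */
#ifdef DETERMINISTIC
#define xsquare demo_cinz_squared2_sse2
#else
#define xsquare demo_squared2_sse2
#endif

/* Generic source: the function is written once under its short name.
   The preprocessor substitutes the exported name selected above. */
double xsquare(double x) {
    return x * x;
}
```

Compiling this file once without `-DDETERMINISTIC` and once with it produces two object files that export different symbol names from the same source, which is the effect the commit message describes for the SIMD source files.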
  14. Aug 22, 2018
  15. Aug 19, 2018
  16. Aug 18, 2018
  17. Aug 17, 2018
  18. Aug 16, 2018
    • Add NEON32+VFPV4 helper (#211) · a266568a
      Naoki Shibata authored
      This patch adds a NEON32+VFPV4 helper, which has FMA support.
      
      VFPV4 is supported on most new 32-bit ARM CPUs, and with it the computation of some functions is much faster.
      This patch does not include a dispatcher.