AVX-optimized sin(), cos(), exp() and log() functions

avx_mathfun.h is a header file implementing AVX-optimized trigonometric, exponential and logarithmic functions.
It is an AVX translation of the SSE2 implementation developed by Julien Pommier, to which I refer the reader for the implementation details.

The code was developed and tested on an AVX processor (Intel® Xeon® CPU E31245 @ 3.30GHz) using Linux and gcc 4.6.
It still uses SSE2 intrinsics for the integer operations, which are not part of the AVX instruction set, but it should be ready for AVX2 processors (I had no computer available on which I could test this).

If you have an AVX2-capable processor and a compiler that implements the corresponding intrinsics, you might want to replace all occurrences of _mm256_and_si128 and _mm256_andnot_si128 with _mm256_and_si256 and _mm256_andnot_si256, respectively. (This was tested with gcc 4.8.2, clang 3.3 and icc 14.0.1, or later revisions. Thanks to Thomas Szczarkowski.)
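
To make the difference between the two code paths concrete, here is a minimal sketch of how a 256-bit integer AND can be built from two SSE2 operations on an AVX-only processor. It is an illustration only and is not taken from avx_mathfun.h: the names and256_via_sse2 and and256_mismatches are hypothetical, and the sketch assumes an x86-64 machine with GCC or Clang (it relies on the target("avx") function attribute and on __builtin_cpu_supports).

```c
#include <assert.h>
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper, not taken from avx_mathfun.h.
   Computes out[i] = a[i] & b[i] for eight 32-bit lanes using only
   AVX loads/stores and SSE2 integer instructions (no AVX2). */
__attribute__((target("avx")))
void and256_via_sse2(const uint32_t *a, const uint32_t *b, uint32_t *out)
{
    assert(a && b && out);

    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);

    /* Split each 256-bit register into two 128-bit halves. */
    __m128i a_lo = _mm256_castsi256_si128(va);
    __m128i a_hi = _mm256_extractf128_si256(va, 1);
    __m128i b_lo = _mm256_castsi256_si128(vb);
    __m128i b_hi = _mm256_extractf128_si256(vb, 1);

    /* SSE2 integer AND on each half. */
    __m128i r_lo = _mm_and_si128(a_lo, b_lo);
    __m128i r_hi = _mm_and_si128(a_hi, b_hi);

    /* Glue the halves back into one 256-bit register.
       With AVX2 the whole body reduces to _mm256_and_si256(va, vb). */
    __m256i r = _mm256_insertf128_si256(_mm256_castsi128_si256(r_lo), r_hi, 1);
    _mm256_storeu_si256((__m256i *)out, r);
}

/* Compares the vector result with a plain scalar AND and returns the
   number of mismatching lanes (0 means the two agree). */
int and256_mismatches(void)
{
    const uint32_t a[8] = {0xFFFFFFFFu, 0x0F0F0F0Fu, 0x12345678u, 0u,
                           0xAAAAAAAAu, 0x80000000u, 1u, 0xDEADBEEFu};
    const uint32_t b[8] = {0x0000FFFFu, 0xF0F0F0F0u, 0xFFFF0000u, 0xFFFFFFFFu,
                           0x55555555u, 0x80000000u, 3u, 0xFFFFFFFFu};
    uint32_t out[8] = {0};
    int bad = 0;

    /* Skip the check on CPUs without AVX to avoid an illegal instruction. */
    if (!__builtin_cpu_supports("avx"))
        return 0;

    and256_via_sse2(a, b, out);
    for (int i = 0; i < 8; ++i)
        bad += (out[i] != (a[i] & b[i]));
    return bad;
}
```

The split, operate and reassemble steps are the extra work an AVX-only processor has to do for every integer operation. On AVX2 hardware the same result would come from a single _mm256_and_si256 call.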

Because the integer operations still go through SSE2, the AVX version is not quite twice as fast as the SSE2 version, although my tests show gains close to 1.7x.
AVX2 processors should come closer to the expected 2x speedup.

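The principal functions provided by avx_mathfun.h are AVX versions of sin(), cos(), exp() and log(), each operating on eight single-precision floats at a time.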

You are welcome to drop me a line at garberoglio [AT] fbk [DOT] eu for any comment or suggestion.
