lib: fix Windows parameter handling in LUT function
The 16x8-bit parallel lookup functions take two parameters: a pointer to the LUT and an __m128i containing the indices. On Linux, the __m128i is passed in xmm0; on Windows, it is stored on the stack and a pointer to it is passed in a general-purpose register. The Windows path must therefore load this memory into xmm0 so that the rest of the function is compatible with the Linux implementation. This commit fixes issue #59.
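
For illustration only, the difference between the two conventions can be modelled in portable C. This is a sketch, not the library's code: `vec16` is a hypothetical stand-in for __m128i, the names `lut_lookup_core` and `lut_lookup_win` are invented here, and the table is assumed to have 256 entries so that any byte index is valid.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __m128i: 16 byte-sized index lanes. */
typedef struct { uint8_t lane[16]; } vec16;

/* Common core: receives the indices by value, the way the Linux
 * path finds them already sitting in xmm0.
 * Assumption: lut points to 256 entries. */
static vec16 lut_lookup_core(const uint8_t *lut, vec16 idx)
{
    vec16 out;
    for (int i = 0; i < 16; i++)
        out.lane[i] = lut[idx.lane[i]];
    return out;
}

/* Windows-style entry point: the indices arrive as a pointer to a
 * copy in memory, so they must be loaded before the core can run. */
static vec16 lut_lookup_win(const uint8_t *lut, const vec16 *idx_mem)
{
    vec16 idx;
    memcpy(&idx, idx_mem, sizeof idx); /* models the load into xmm0 */
    return lut_lookup_core(lut, idx);
}
```

Both entry points return the same result for the same indices; the only difference is the extra load on the Windows side.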