GCM VAES/VPCLMULQDQ performance improvements for large buffers (issue #35)

- GCM_BIG_DATA compile flag added
  - disabled by default, number of ghash keys is 48 (key structure 1,152 bytes)
  - when enabled, number of ghash keys is 128 (key structure 2,432 bytes)
- precomputing 128 or 48 GHASH keys (the structure is much bigger now)
- when GCM_BIG_DATA is on and data >= 2,048 bytes, reduction is done every 128 blocks
- for data >= 768 bytes, reduction is done every 48 blocks
- for other cases, reduction is done every 8 blocks
- added new macro handling 16 blocks of AES and GHASH in parallel
  - pipeline depth is 32 blocks
  - very large and large buffers leverage the macro
- initial N x 16 blocks macro implemented
  - pipelines cipher with GHASH
  - first runs cipher only as defined by depth of the pipeline
  - then runs stitched cipher and GHASH for the remaining number of blocks
- parallel cipher and ghash N x 16 blocks implemented
  - cipher and ghash always stitched
  - reduction done as defined by maximum number of blocks
  - depth of the pipeline maintained as in initial N x 16 macro
  - cipher is ahead of ghash by 32 blocks
- stack frame created to keep up to 128 blocks of cipher text
  - stack frame loads/stores are aligned
- gcm key data structure definition made more generic
- VX512STR and VX512LDR macros changed from vmovdqu64 to vmovdqu8
  in order to work correctly with masked operations

Change-Id: Idd83b911c9257bbd221c66ddd9297a4f2ae120c2
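
The reduction thresholds listed above can be sketched in C as a small selection function. This is only an illustration of the stated rules, not code from the change itself: the function names and the `big_data` parameter (standing in for the GCM_BIG_DATA compile flag) are hypothetical, and the real implementation is in assembly macros. Note that both thresholds equal one full interval of 16-byte blocks (128 x 16 = 2,048 and 48 x 16 = 768).

```c
#include <stddef.h>

/* Hypothetical helper names, for illustration only. */

/*
 * Number of GHASH blocks accumulated between reductions,
 * following the thresholds given in the commit message.
 * big_data != 0 models a build with GCM_BIG_DATA enabled.
 */
static inline unsigned ghash_reduction_interval(size_t len_bytes, int big_data)
{
    if (big_data && len_bytes >= 2048)
        return 128;     /* very large buffers, GCM_BIG_DATA builds only */
    if (len_bytes >= 768)
        return 48;      /* large buffers */
    return 8;           /* all other cases */
}

/* Number of precomputed GHASH keys for a given build configuration. */
static inline unsigned ghash_key_count(int big_data)
{
    return big_data ? 128u : 48u;
}
```

For example, under these assumptions a 4,096-byte buffer would be reduced every 128 blocks in a GCM_BIG_DATA build and every 48 blocks in a default build.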