|
* cipher/Makefile.am: Add 'sha1-avx-amd64.S' and
'sha1-avx-bmi2-amd64.S'.
* cipher/sha1-avx-amd64.S: New.
* cipher/sha1-avx-bmi2-amd64.S: New.
* cipher/sha1.c (USE_AVX, USE_BMI2): New.
(SHA1_CONTEXT) [USE_AVX]: Add 'use_avx'.
(SHA1_CONTEXT) [USE_BMI2]: Add 'use_bmi2'.
(sha1_init): Initialize 'use_avx' and 'use_bmi2'.
[USE_AVX] (_gcry_sha1_transform_amd64_avx): New.
[USE_BMI2] (_gcry_sha1_transform_amd64_bmi2): New.
(transform) [USE_BMI2]: Use BMI2 assembly if enabled.
(transform) [USE_AVX]: Use AVX assembly if enabled.
* configure.ac: Add 'sha1-avx-amd64.lo' and 'sha1-avx-bmi2-amd64.lo'.
--
Patch adds AVX (for Sandybridge and Ivybridge) and AVX/BMI2 (for Haswell)
optimized implementations of SHA-1.
Note: AVX implementation is currently limited to Intel CPUs due to use
of SHLD instruction for faster rotations on Sandybrigde.
Benchmarks:
cpu C-version SSSE3 AVX/(SHLD|BMI2) New vs C New vs SSSE3
Intel i5-4570 8.84 c/B 4.61 c/B 3.86 c/B 2.29x 1.19x
Intel i5-2450M 9.45 c/B 5.30 c/B 4.39 c/B 2.15x 1.20x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|