From a5c2bbfe0db515d739ab683297903c77b1eec124 Mon Sep 17 00:00:00 2001 From: Jussi Kivilinna Date: Tue, 17 Dec 2013 15:35:38 +0200 Subject: Add AVX and AVX2/BMI implementations for SHA-256 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * LICENSES: Add 'cipher/sha256-avx-amd64.S' and 'cipher/sha256-avx2-bmi2-amd64.S'. * cipher/Makefile.am: Add 'sha256-avx-amd64.S' and 'sha256-avx2-bmi2-amd64.S'. * cipher/sha256-avx-amd64.S: New. * cipher/sha256-avx2-bmi2-amd64.S: New. * cipher/sha256-ssse3-amd64.S: Use 'lea' instead of 'add' in few places for tiny speed improvement. * cipher/sha256.c (USE_AVX, USE_AVX2): New. (SHA256_CONTEXT) [USE_AVX, USE_AVX2]: Add 'use_avx' and 'use_avx2'. (sha256_init, sha224_init) [USE_AVX, USE_AVX2]: Initialize above new context members. [USE_AVX] (_gcry_sha256_transform_amd64_avx): New. [USE_AVX2] (_gcry_sha256_transform_amd64_avx2): New. (transform) [USE_AVX2]: Use AVX2 assembly if enabled. (transform) [USE_AVX]: Use AVX assembly if enabled. * configure.ac: Add 'sha256-avx-amd64.lo' and 'sha256-avx2-bmi2-amd64.lo'. -- Patch adds fast AVX and AVX2/BMI2 implementations of SHA-256 by Intel Corporation. The assembly source is licensed under 3-clause BSD license, thus compatible with LGPL2.1+. Original source can be accessed at: http://www.intel.com/p/en_US/embedded/hwsw/technology/packet-processing#docs Implementation is described in white paper "Fast SHA - 256 Implementations on IntelĀ® Architecture Processors" http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/sha-256-implementations-paper.html Note: AVX implementation uses SHLD instruction to emulate RORQ, since it's faster on Intel Sandy-Bridge. However, on non-Intel CPUs SHLD is much slower than RORQ, so therefore AVX implementation is (for now) limited to Intel CPUs. Note: AVX2 implementation also uses BMI2 instruction rorx, thus additional HWF flag. Benchmarks: cpu C-lang SSSE3 AVX/AVX2 C vs AVX/AVX2 vs SSSE3 Intel i5-4570 13.86 c/B 10.27 c/B 8.70 c/B 1.59x 1.18x Intel i5-2450M 17.25 c/B 12.36 c/B 10.31 c/B 1.67x 1.19x Signed-off-by: Jussi Kivilinna --- LICENSES | 2 ++ 1 file changed, 2 insertions(+) (limited to 'LICENSES') diff --git a/LICENSES b/LICENSES index 8594cfd8..6c09e1fe 100644 --- a/LICENSES +++ b/LICENSES @@ -12,6 +12,8 @@ with any binary distributions derived from the GNU C Library. * BSD_3Clause For files: + - cipher/sha256-avx-amd64.S + - cipher/sha256-avx2-bmi2-amd64.S - cipher/sha256-ssse3-amd64.S - cipher/sha512-avx-amd64.S - cipher/sha512-avx2-bmi2-amd64.S -- cgit v1.2.1