author     Jussi Kivilinna <jussi.kivilinna@iki.fi>  2013-10-26 15:00:48 +0300
committer  Jussi Kivilinna <jussi.kivilinna@iki.fi>  2013-10-28 16:12:19 +0200
commit     5a3d43485efdc09912be0967ee0a3ce345b3b15a
tree       ff8e937e2d010ae8e015707f5665915dabe1e915 /configure.ac
parent     e214e8392671dd30e9c33260717b5e756debf3bf
Add AMD64 assembly implementation of Salsa20

* cipher/Makefile.am: Add 'salsa20-amd64.S'.
* cipher/salsa20-amd64.S: New.
* cipher/salsa20.c (USE_AMD64): New macro.
[USE_AMD64] (_gcry_salsa20_amd64_keysetup, _gcry_salsa20_amd64_ivsetup)
(_gcry_salsa20_amd64_encrypt_blocks): New prototypes.
[USE_AMD64] (salsa20_keysetup, salsa20_ivsetup, salsa20_core): New.
[!USE_AMD64] (salsa20_core): Change 'src' to non-constant, update block
counter in 'salsa20_core' and return burn stack depth.
[!USE_AMD64] (salsa20_keysetup, salsa20_ivsetup): New.
(salsa20_do_setkey): Move generic key setup to 'salsa20_keysetup'.
(salsa20_setkey): Fix burn stack depth.
(salsa20_setiv): Move generic IV setup to 'salsa20_ivsetup'.
(salsa20_do_encrypt_stream) [USE_AMD64]: Process large buffers in AMD64
implementation.
(salsa20_do_encrypt_stream): Move stack burning to this function...
(salsa20_encrypt_stream, salsa20r12_encrypt_stream): ...from these
functions.
* configure.ac [x86-64]: Add 'salsa20-amd64.lo'.
--

This patch adds a fast AMD64 assembly implementation of Salsa20. The
implementation is based on public domain code by D. J. Bernstein, available
at http://cr.yp.to/snuffle.html (amd64-xmm6). It gains extra speed by
processing four blocks in parallel with the help of SSE2 instructions.

Benchmark results on Intel Core i5-4570 (3.2 GHz):

Before:
 SALSA20        |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |      3.88 ns/B     246.0 MiB/s     12.41 c/B
     STREAM dec |      3.88 ns/B     246.0 MiB/s     12.41 c/B
                =
 SALSA20R12     |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |      2.46 ns/B     387.9 MiB/s      7.87 c/B
     STREAM dec |      2.46 ns/B     387.7 MiB/s      7.87 c/B

After:
 SALSA20        |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |     0.985 ns/B     967.8 MiB/s      3.15 c/B
     STREAM dec |     0.987 ns/B     966.5 MiB/s      3.16 c/B
                =
 SALSA20R12     |  nanosecs/byte   mebibytes/sec   cycles/byte
     STREAM enc |     0.636 ns/B    1500.5 MiB/s      2.03 c/B
     STREAM dec |     0.636 ns/B    1499.2 MiB/s      2.04 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
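
The benchmark figures in the commit message can be cross-checked with a little arithmetic. The sketch below is not part of the patch; the helper names `speedup` and `cycles_per_byte` are hypothetical, and it assumes cycles/byte is simply ns/byte multiplied by the 3.2 GHz clock quoted in the message. Because the inputs are already rounded, results agree with the reported values only approximately.

```shell
#!/bin/sh
# Hypothetical helpers for checking the reported numbers; not from libgcrypt.

# Ratio of "before" to "after" ns/byte, i.e. how many times faster.
speedup() {
  awk -v before="$1" -v after="$2" 'BEGIN { printf "%.2f\n", before / after }'
}

# Assumed relation: cycles/byte = ns/byte * clock frequency in GHz.
cycles_per_byte() {
  awk -v ns="$1" -v ghz="$2" 'BEGIN { printf "%.2f\n", ns * ghz }'
}

speedup 3.88 0.985         # SALSA20 STREAM enc    -> 3.94
speedup 2.46 0.636         # SALSA20R12 STREAM enc -> 3.87
cycles_per_byte 0.985 3.2  # SALSA20 after         -> 3.15
```

Both variants come out a little under four times faster, which is in line with processing four blocks in parallel.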
Diffstat (limited to 'configure.ac')
-rw-r--r-- configure.ac | 7
1 files changed, 7 insertions, 0 deletions
diff --git a/configure.ac b/configure.ac
index 5b7ba0d8..114460c2 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1553,6 +1553,13 @@ LIST_MEMBER(salsa20, $enabled_ciphers)
 if test "$found" = "1" ; then
    GCRYPT_CIPHERS="$GCRYPT_CIPHERS salsa20.lo"
    AC_DEFINE(USE_SALSA20, 1, [Defined if this module should be included])
+
+   case "${host}" in
+      x86_64-*-*)
+         # Build with the assembly implementation
+         GCRYPT_CIPHERS="$GCRYPT_CIPHERS salsa20-amd64.lo"
+      ;;
+   esac
 fi
 LIST_MEMBER(gost28147, $enabled_ciphers)
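
The hunk appends the assembly object only when the host triplet matches the glob `x86_64-*-*`. The following is a minimal sketch of that pattern match in plain POSIX shell, outside of Autoconf. The function name `salsa20_objects` and the sample triplets are hypothetical illustrations, not part of libgcrypt.

```shell
#!/bin/sh
# Hypothetical helper (not from libgcrypt): mimics the configure.ac logic
# for a given host triplet and prints the resulting object list.
salsa20_objects() {
  host="$1"
  objs="salsa20.lo"          # generic C implementation is always built
  case "${host}" in
    x86_64-*-*)
      # Same glob as in the hunk: any vendor and OS on x86-64.
      objs="$objs salsa20-amd64.lo"
      ;;
  esac
  printf '%s\n' "$objs"
}

salsa20_objects x86_64-pc-linux-gnu   # prints "salsa20.lo salsa20-amd64.lo"
salsa20_objects i686-pc-linux-gnu     # prints "salsa20.lo"
```

Since `salsa20.lo` is still built on every host, the decision to use the assembly code path is left to the C source, which the commit message says is done through the `USE_AMD64` macro in cipher/salsa20.c.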