diff options
author | Jussi Kivilinna <jussi.kivilinna@iki.fi> | 2013-10-26 15:00:48 +0300 |
---|---|---|
committer | Jussi Kivilinna <jussi.kivilinna@iki.fi> | 2013-10-28 16:12:19 +0200 |
commit | 5a3d43485efdc09912be0967ee0a3ce345b3b15a (patch) | |
tree | ff8e937e2d010ae8e015707f5665915dabe1e915 /cipher/Makefile.am | |
parent | e214e8392671dd30e9c33260717b5e756debf3bf (diff) | |
download | libgcrypt-5a3d43485efdc09912be0967ee0a3ce345b3b15a.tar.gz |
Add AMD64 assembly implementation of Salsa20
* cipher/Makefile.am: Add 'salsa20-amd64.S'.
* cipher/salsa20-amd64.S: New.
* cipher/salsa20.c (USE_AMD64): New macro.
[USE_AMD64] (_gcry_salsa20_amd64_keysetup, _gcry_salsa20_amd64_ivsetup)
(_gcry_salsa20_amd64_encrypt_blocks): New prototypes.
[USE_AMD64] (salsa20_keysetup, salsa20_ivsetup, salsa20_core): New.
[!USE_AMD64] (salsa20_core): Change 'src' to non-constant, update block
counter in 'salsa20_core' and return burn stack depth.
[!USE_AMD64] (salsa20_keysetup, salsa20_ivsetup): New.
(salsa20_do_setkey): Move generic key setup to 'salsa20_keysetup'.
(salsa20_setkey): Fix burn stack depth.
(salsa20_setiv): Move generic IV setup to 'salsa20_ivsetup'.
(salsa20_do_encrypt_stream) [USE_AMD64]: Process large buffers in AMD64
implementation.
(salsa20_do_encrypt_stream): Move stack burning to this function...
(salsa20_encrypt_stream, salsa20r12_encrypt_stream): ...from these
functions.
* configure.ac [x86-64]: Add 'salsa20-amd64.lo'.
--
Patch adds fast AMD64 assembly implementation for Salsa20. This implementation
is based on public domain code by D. J. Bernstein and it is available at
http://cr.yp.to/snuffle.html (amd64-xmm6). Implementation gains extra speed
by processing four blocks in parallel with help SSE2 instructions.
Benchmark results on Intel Core i5-4570 (3.2 Ghz):
Before:
SALSA20 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 3.88 ns/B 246.0 MiB/s 12.41 c/B
STREAM dec | 3.88 ns/B 246.0 MiB/s 12.41 c/B
=
SALSA20R12 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 2.46 ns/B 387.9 MiB/s 7.87 c/B
STREAM dec | 2.46 ns/B 387.7 MiB/s 7.87 c/B
After:
SALSA20 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 0.985 ns/B 967.8 MiB/s 3.15 c/B
STREAM dec | 0.987 ns/B 966.5 MiB/s 3.16 c/B
=
SALSA20R12 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 0.636 ns/B 1500.5 MiB/s 2.03 c/B
STREAM dec | 0.636 ns/B 1499.2 MiB/s 2.04 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Diffstat (limited to 'cipher/Makefile.am')
-rw-r--r-- | cipher/Makefile.am | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/cipher/Makefile.am b/cipher/Makefile.am index d7db9337..e786713e 100644 --- a/cipher/Makefile.am +++ b/cipher/Makefile.am @@ -71,7 +71,7 @@ md5.c \ rijndael.c rijndael-tables.h rijndael-amd64.S rijndael-arm.S \ rmd160.c \ rsa.c \ -salsa20.c \ +salsa20.c salsa20-amd64.S \ scrypt.c \ seed.c \ serpent.c serpent-sse2-amd64.S serpent-avx2-amd64.S \ |