Age | Commit message (Collapse) | Author | Files | Lines |
|
* cipher/salsa20-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/salsa20.c (USE_AMD64): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[USE_AMD64] (ASM_FUNC_ABI, ASM_EXTRA_STACK): New.
(_gcry_salsa20_amd64_keysetup, _gcry_salsa20_amd64_ivsetup)
(_gcry_salsa20_amd64_encrypt_blocks): Add ASM_FUNC_ABI.
[USE_AMD64] (salsa20_core): Add ASM_EXTRA_STACK.
(salsa20_do_encrypt_stream) [USE_AMD64]: Add ASM_EXTRA_STACK.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
cipher/blowfish-amd64.S: Change utf-8 encoded copyright character to
'(C)'.
cipher/blowfish-arm.S: Ditto.
cipher/bufhelp.h: Ditto.
cipher/camellia-aesni-avx-amd64.S: Ditto.
cipher/camellia-aesni-avx2-amd64.S: Ditto.
cipher/camellia-arm.S: Ditto.
cipher/cast5-amd64.S: Ditto.
cipher/cast5-arm.S: Ditto.
cipher/cipher-ccm.c: Ditto.
cipher/cipher-cmac.c: Ditto.
cipher/cipher-gcm.c: Ditto.
cipher/cipher-selftest.c: Ditto.
cipher/cipher-selftest.h: Ditto.
cipher/mac-cmac.c: Ditto.
cipher/mac-gmac.c: Ditto.
cipher/mac-hmac.c: Ditto.
cipher/mac-internal.h: Ditto.
cipher/mac.c: Ditto.
cipher/rijndael-amd64.S: Ditto.
cipher/rijndael-arm.S: Ditto.
cipher/salsa20-amd64.S: Ditto.
cipher/salsa20-armv7-neon.S: Ditto.
cipher/serpent-armv7-neon.S: Ditto.
cipher/serpent-avx2-amd64.S: Ditto.
cipher/serpent-sse2-amd64.S: Ditto.
--
Avoid use of '©' for easier parsing of source for copyright information.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/salsa20-amd64.S: Rename '._labels' to '.L_labels'.
* cipher/salsa20-armv7-neon.S: Ditto.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/Makefile.am: Add 'salsa20-amd64.S'.
* cipher/salsa20-amd64.S: New.
* cipher/salsa20.c (USE_AMD64): New macro.
[USE_AMD64] (_gcry_salsa20_amd64_keysetup, _gcry_salsa20_amd64_ivsetup)
(_gcry_salsa20_amd64_encrypt_blocks): New prototypes.
[USE_AMD64] (salsa20_keysetup, salsa20_ivsetup, salsa20_core): New.
[!USE_AMD64] (salsa20_core): Change 'src' to non-constant, update block
counter in 'salsa20_core' and return burn stack depth.
[!USE_AMD64] (salsa20_keysetup, salsa20_ivsetup): New.
(salsa20_do_setkey): Move generic key setup to 'salsa20_keysetup'.
(salsa20_setkey): Fix burn stack depth.
(salsa20_setiv): Move generic IV setup to 'salsa20_ivsetup'.
(salsa20_do_encrypt_stream) [USE_AMD64]: Process large buffers in AMD64
implementation.
(salsa20_do_encrypt_stream): Move stack burning to this function...
(salsa20_encrypt_stream, salsa20r12_encrypt_stream): ...from these
functions.
* configure.ac [x86-64]: Add 'salsa20-amd64.lo'.
--
Patch adds fast AMD64 assembly implementation for Salsa20. This implementation
is based on public domain code by D. J. Bernstein and it is available at
http://cr.yp.to/snuffle.html (amd64-xmm6). Implementation gains extra speed
by processing four blocks in parallel with help SSE2 instructions.
Benchmark results on Intel Core i5-4570 (3.2 Ghz):
Before:
SALSA20 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 3.88 ns/B 246.0 MiB/s 12.41 c/B
STREAM dec | 3.88 ns/B 246.0 MiB/s 12.41 c/B
=
SALSA20R12 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 2.46 ns/B 387.9 MiB/s 7.87 c/B
STREAM dec | 2.46 ns/B 387.7 MiB/s 7.87 c/B
After:
SALSA20 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 0.985 ns/B 967.8 MiB/s 3.15 c/B
STREAM dec | 0.987 ns/B 966.5 MiB/s 3.16 c/B
=
SALSA20R12 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 0.636 ns/B 1500.5 MiB/s 2.03 c/B
STREAM dec | 0.636 ns/B 1499.2 MiB/s 2.04 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|