Age | Commit message (Collapse) | Author | Files | Lines |
|
* cipher/salsa20.c (selftest): Ensure 16-byte alignment for salsa20
context structure.
--
Reported-by: Carlos J Puga Medina <cpm@fbsd.es>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/salsa20-amd64.S: Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
(ELF): New macro to mask lines with ELF specific commands.
* cipher/salsa20.c (USE_AMD64): Enable when
HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS defined.
[USE_AMD64] (ASM_FUNC_ABI, ASM_EXTRA_STACK): New.
(_gcry_salsa20_amd64_keysetup, _gcry_salsa20_amd64_ivsetup)
(_gcry_salsa20_amd64_encrypt_blocks): Add ASM_FUNC_ABI.
[USE_AMD64] (salsa20_core): Add ASM_EXTRA_STACK.
(salsa20_do_encrypt_stream) [USE_AMD64]: Add ASM_EXTRA_STACK.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/salsa20.c (USE_ARM_NEON_ASM): Define only if
ENABLE_NEON_SUPPORT is defined.
* cipher/serpent.c (USE_NEON): Ditto.
* cipher/sha1.c (USE_NEON): Ditto.
* cipher/sha512.c (USE_ARM_NEON_ASM): Ditto.
--
The generic C source files must only include NEON support if that is
enabled. The dedicated ASM files are conditionally compiled and thus
do not need to use it.
GnuPG-bug-id: 1603
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* cipher/arcfour.c (do_encrypt_stream, encrypt_stream): Use 'size_t'
for buffer lengths.
* cipher/blowfish.c (_gcry_blowfish_ctr_enc, _gcry_blowfish_cbc_dec)
(_gcry_blowfish_cfb_dec): Ditto.
* cipher/camellia-glue.c (_gcry_camellia_ctr_enc)
(_gcry_camellia_cbc_dec, _gcry_blowfish_cfb_dec): Ditto.
* cipher/cast5.c (_gcry_cast5_ctr_enc, _gcry_cast5_cbc_dec)
(_gcry_cast5_cfb_dec): Ditto.
* cipher/cipher-aeswrap.c (_gcry_cipher_aeswrap_encrypt)
(_gcry_cipher_aeswrap_decrypt): Ditto.
* cipher/cipher-cbc.c (_gcry_cipher_cbc_encrypt)
(_gcry_cipher_cbc_decrypt): Ditto.
* cipher/cipher-ccm.c (_gcry_cipher_ccm_encrypt)
(_gcry_cipher_ccm_decrypt): Ditto.
* cipher/cipher-cfb.c (_gcry_cipher_cfb_encrypt)
(_gcry_cipher_cfb_decrypt): Ditto.
* cipher/cipher-ctr.c (_gcry_cipher_ctr_encrypt): Ditto.
* cipher/cipher-internal.h (gcry_cipher_handle->bulk)
(_gcry_cipher_cbc_encrypt, _gcry_cipher_cbc_decrypt)
(_gcry_cipher_cfb_encrypt, _gcry_cipher_cfb_decrypt)
(_gcry_cipher_ofb_encrypt, _gcry_cipher_ctr_encrypt)
(_gcry_cipher_aeswrap_encrypt, _gcry_cipher_aeswrap_decrypt)
(_gcry_cipher_ccm_encrypt, _gcry_cipher_ccm_decrypt): Ditto.
* cipher/cipher-ofb.c (_gcry_cipher_cbc_encrypt): Ditto.
* cipher/cipher-selftest.h (gcry_cipher_bulk_cbc_dec_t)
(gcry_cipher_bulk_cfb_dec_t, gcry_cipher_bulk_ctr_enc_t): Ditto.
* cipher/cipher.c (cipher_setkey, cipher_setiv, do_ecb_crypt)
(do_ecb_encrypt, do_ecb_decrypt, cipher_encrypt)
(cipher_decrypt): Ditto.
* cipher/rijndael.c (_gcry_aes_ctr_enc, _gcry_aes_cbc_dec)
(_gcry_aes_cfb_dec, _gcry_aes_cbc_enc, _gcry_aes_cfb_enc): Ditto.
* cipher/salsa20.c (salsa20_setiv, salsa20_do_encrypt_stream)
(salsa20_encrypt_stream, salsa20r12_encrypt_stream): Ditto.
* cipher/serpent.c (_gcry_serpent_ctr_enc, _gcry_serpent_cbc_dec)
(_gcry_serpent_cfb_dec): Ditto.
* cipher/twofish.c (_gcry_twofish_ctr_enc, _gcry_twofish_cbc_dec)
(_gcry_twofish_cfb_dec): Ditto.
* src/cipher-proto.h (gcry_cipher_stencrypt_t)
(gcry_cipher_stdecrypt_t, cipher_setiv_fuct_t): Ditto.
* src/cipher.h (_gcry_aes_cfb_enc, _gcry_aes_cfb_dec)
(_gcry_aes_cbc_enc, _gcry_aes_cbc_dec, _gcry_aes_ctr_enc)
(_gcry_blowfish_cfb_dec, _gcry_blowfish_cbc_dec)
(_gcry_blowfish_ctr_enc, _gcry_cast5_cfb_dec, _gcry_cast5_cbc_dec)
(_gcry_cast5_ctr_enc, _gcry_camellia_cfb_dec, _gcry_camellia_cbc_dec)
(_gcry_camellia_ctr_enc, _gcry_serpent_cfb_dec, _gcry_serpent_cbc_dec)
(_gcry_serpent_ctr_enc, _gcry_twofish_cfb_dec, _gcry_twofish_cbc_dec)
(_gcry_twofish_ctr_enc): Ditto.
--
On 64-bit platforms, cipher module internally converts 64-bit size_t values
to 32-bit unsigned integers.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/Makefile.am: Add 'salsa20-armv7-neon.S'.
* cipher/salsa20-armv7-neon.S: New.
* cipher/salsa20.c [USE_ARM_NEON_ASM]: New macro.
(struct SALSA20_context_s, salsa20_core_t, salsa20_keysetup_t)
(salsa20_ivsetup_t): New.
(SALSA20_context_t) [USE_ARM_NEON_ASM]: Add 'use_neon'.
(SALSA20_context_t): Add 'keysetup', 'ivsetup' and 'core'.
(salsa20_core): Change 'src' argument to 'ctx'.
[USE_ARM_NEON_ASM] (_gcry_arm_neon_salsa20_encrypt): New prototype.
[USE_ARM_NEON_ASM] (salsa20_core_neon, salsa20_keysetup_neon)
(salsa20_ivsetup_neon): New.
(salsa20_do_setkey): Setup keysetup, ivsetup and core with default
functions.
(salsa20_do_setkey) [USE_ARM_NEON_ASM]: When NEON support detect,
set keysetup, ivsetup and core with ARM NEON functions.
(salsa20_do_setkey): Call 'ctx->keysetup'.
(salsa20_setiv): Call 'ctx->ivsetup'.
(salsa20_do_encrypt_stream) [USE_ARM_NEON_ASM]: Process large buffers
in ARM NEON implementation.
(salsa20_do_encrypt_stream): Call 'ctx->core' instead of directly
calling 'salsa20_core'.
(selftest): Add test to check large buffer processing and block counter
updating.
* configure.ac [neonsupport]: 'Add salsa20-armv7-neon.lo'.
--
Patch adds fast ARM NEON assembly implementation for Salsa20. Implementation
gains extra speed by processing three blocks in parallel with help of ARM
NEON vector processing unit.
This implementation is based on public domain code by Peter Schwabe and D. J.
Bernstein and it is available in SUPERCOP benchmarking framework. For more
details on this work, check paper "NEON crypto" by Daniel J. Bernstein and
Peter Schwabe:
http://cryptojedi.org/papers/#neoncrypto
Benchmark results on Cortex-A8 (1008 Mhz):
Before:
SALSA20 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 18.88 ns/B 50.51 MiB/s 19.03 c/B
STREAM dec | 18.89 ns/B 50.49 MiB/s 19.04 c/B
=
SALSA20R12 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 13.60 ns/B 70.14 MiB/s 13.71 c/B
STREAM dec | 13.60 ns/B 70.13 MiB/s 13.71 c/B
After:
SALSA20 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 5.48 ns/B 174.1 MiB/s 5.52 c/B
STREAM dec | 5.47 ns/B 174.2 MiB/s 5.52 c/B
=
SALSA20R12 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 3.65 ns/B 260.9 MiB/s 3.68 c/B
STREAM dec | 3.65 ns/B 261.6 MiB/s 3.67 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/Makefile.am: Add 'salsa20-amd64.S'.
* cipher/salsa20-amd64.S: New.
* cipher/salsa20.c (USE_AMD64): New macro.
[USE_AMD64] (_gcry_salsa20_amd64_keysetup, _gcry_salsa20_amd64_ivsetup)
(_gcry_salsa20_amd64_encrypt_blocks): New prototypes.
[USE_AMD64] (salsa20_keysetup, salsa20_ivsetup, salsa20_core): New.
[!USE_AMD64] (salsa20_core): Change 'src' to non-constant, update block
counter in 'salsa20_core' and return burn stack depth.
[!USE_AMD64] (salsa20_keysetup, salsa20_ivsetup): New.
(salsa20_do_setkey): Move generic key setup to 'salsa20_keysetup'.
(salsa20_setkey): Fix burn stack depth.
(salsa20_setiv): Move generic IV setup to 'salsa20_ivsetup'.
(salsa20_do_encrypt_stream) [USE_AMD64]: Process large buffers in AMD64
implementation.
(salsa20_do_encrypt_stream): Move stack burning to this function...
(salsa20_encrypt_stream, salsa20r12_encrypt_stream): ...from these
functions.
* configure.ac [x86-64]: Add 'salsa20-amd64.lo'.
--
Patch adds fast AMD64 assembly implementation for Salsa20. This implementation
is based on public domain code by D. J. Bernstein and it is available at
http://cr.yp.to/snuffle.html (amd64-xmm6). Implementation gains extra speed
by processing four blocks in parallel with help SSE2 instructions.
Benchmark results on Intel Core i5-4570 (3.2 Ghz):
Before:
SALSA20 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 3.88 ns/B 246.0 MiB/s 12.41 c/B
STREAM dec | 3.88 ns/B 246.0 MiB/s 12.41 c/B
=
SALSA20R12 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 2.46 ns/B 387.9 MiB/s 7.87 c/B
STREAM dec | 2.46 ns/B 387.7 MiB/s 7.87 c/B
After:
SALSA20 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 0.985 ns/B 967.8 MiB/s 3.15 c/B
STREAM dec | 0.987 ns/B 966.5 MiB/s 3.16 c/B
=
SALSA20R12 | nanosecs/byte mebibytes/sec cycles/byte
STREAM enc | 0.636 ns/B 1500.5 MiB/s 2.03 c/B
STREAM dec | 0.636 ns/B 1499.2 MiB/s 2.04 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* src/gcrypt-module.h (gcry_cipher_spec_t): Move to ...
* src/cipher-proto.h (gcry_cipher_spec_t): here. Merge with
cipher_extra_spec_t. Add fields ALGO and FLAGS. Set these fields in
all cipher modules.
* cipher/cipher.c: Change most code to replace the former module
system by a simpler system to gain information about the algorithms.
(disable_pubkey_algo): Simplified. Not anymore thread-safe, though.
* cipher/md.c (_gcry_md_selftest): Use correct structure. Not a real
problem because both define the same function as their first field.
* cipher/pubkey.c (_gcry_pk_selftest): Take care of the disabled flag.
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* cipher/bithelp.h (bswap32, bswap64, le_bswap32, be_bswap32)
(le_bswap64, be_bswap64): New.
* cipher/bufhelp.h (buf_get_be32, buf_get_le32, buf_put_le32)
(buf_put_be32, buf_get_be64, buf_get_le64, buf_put_be64)
(buf_put_le64): New.
* cipher/blowfish.c (do_encrypt_block, do_decrypt_block): Use new
endian conversion helpers.
(do_bf_setkey): Turn endian specific code to generic.
* cipher/camellia.c (GETU32, PUTU32): Use new endian conversion
helpers.
* cipher/cast5.c (rol): Remove, use rol from bithelp.
(F1, F2, F3): Fix to use rol from bithelp.
(do_encrypt_block, do_decrypt_block, do_cast_setkey): Use new endian
conversion helpers.
* cipher/des.c (READ_64BIT_DATA, WRITE_64BIT_DATA): Ditto.
* cipher/md4.c (transform, md4_final): Ditto.
* cipher/md5.c (transform, md5_final): Ditto.
* cipher/rmd160.c (transform, rmd160_final): Ditto.
* cipher/salsa20.c (LE_SWAP32, LE_READ_UINT32): Ditto.
* cipher/scrypt.c (READ_UINT64, LE_READ_UINT64, LE_SWAP32): Ditto.
* cipher/seed.c (GETU32, PUTU32): Ditto.
* cipher/serpent.c (byte_swap_32): Remove.
(serpent_key_prepare, serpent_encrypt_internal)
(serpent_decrypt_internal): Use new endian conversion helpers.
* cipher/sha1.c (transform, sha1_final): Ditto.
* cipher/sha256.c (transform, sha256_final): Ditto.
* cipher/sha512.c (__transform, sha512_final): Ditto.
* cipher/stribog.c (transform, stribog_final): Ditto.
* cipher/tiger.c (transform, tiger_final): Ditto.
* cipher/twofish.c (INPACK, OUTUNPACK): Ditto.
* cipher/whirlpool.c (buffer_to_block, block_to_buffer): Ditto.
* configure.ac (gcry_cv_have_builtin_bswap32): Check for compiler
provided __builtin_bswap32.
(gcry_cv_have_builtin_bswap64): Check for compiler provided
__builtin_bswap64.
--
Patch add helper functions that provide conversions to/from integers and
buffers of different endianess. Benefits are code cleanup and optimization
for architectures that have byte-swaping instructions and/or can do fast
unaligned memory accesses.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* src/gcrypt.h.in (GCRY_CIPHER_SALSA20R12): New.
* src/salsa20.c (salsa20_core, salsa20_do_encrypt_stream): Add support
for reduced round versions.
(salsa20r12_encrypt_stream, _gcry_cipher_spec_salsa20r12): Implement
Salsa20/12 - a 12 round version of Salsa20 selected by eStream.
* src/cipher.h: Declsare Salsa20/12 definition.
* cipher/cipher.c: Register Salsa20/12
* tests/basic.c: (check_stream_cipher, check_stream_cipher_large_block):
Populate Salsa20/12 tests with test vectors from ecrypt
(check_ciphers): Add simple test for Salsa20/12
--
Salsa20/12 is a reduced round version of Salsa20 that is amongst ciphers
selected by eSTREAM for Phase 3 of Profile 1 algorithm. Moreover it is
one of proposed ciphers for TLS (draft-josefsson-salsa20-tls-02).
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
|
|
* src/gcrypt.h.in (GCRY_CIPHER_SALSA20): New.
* cipher/salsa20.c: New.
* configure.ac (available_ciphers): Add Salsa20.
* cipher/cipher.c: Register Salsa20.
(cipher_setiv): Allow to divert an IV to a cipher module.
* src/cipher-proto.h (cipher_setiv_func_t): New.
(cipher_extra_spec): Add field setiv.
* src/cipher.h: Declare Salsa20 definitions.
* tests/basic.c (check_stream_cipher): New.
(check_stream_cipher_large_block): New.
(check_cipher_modes): Run new test functions.
(check_ciphers): Add simple test for Salsa20.
Signed-off-by: Werner Koch <wk@gnupg.org>
|