|
* cipher/ecc.c (sign): Add args FLAGS and HASHALGO. Convert an opaque
MPI as INPUT. Implement rfc-6979.
(ecc_sign): Remove the opaque MPI code and pass FLAGS to sign.
(verify): Do not allocate and compute Y; it is not used.
(ecc_verify): Truncate the hash value if needed.
* tests/dsa-rfc6979.c (check_dsa_rfc6979): Add ECDSA test cases.
Signed-off-by: Werner Koch <wk@gnupg.org>
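The hash truncation mentioned for ecc_verify can be sketched as follows. This is only an illustration on plain byte buffers, assuming the usual ECDSA convention of keeping the leftmost bits of the hash; the function name truncate_hash is hypothetical and the actual code operates on MPIs.

```c
#include <stddef.h>
#include <string.h>

/* Keep only the leftmost QBITS bits of the big-endian HASH (HLEN bytes).
   OUT needs room for (QBITS + 7) / 8 bytes, or for HLEN bytes if the
   hash is already short enough.  Returns the number of bytes written.
   Hypothetical helper, not taken from cipher/ecc.c.  */
size_t
truncate_hash (const unsigned char *hash, size_t hlen,
               unsigned int qbits, unsigned char *out)
{
  size_t nbytes = (qbits + 7) / 8;
  unsigned int shift;
  size_t i;

  if (hlen * 8 <= qbits)
    {
      memcpy (out, hash, hlen);   /* Nothing to truncate.  */
      return hlen;
    }

  /* Take the leading bytes and shift right to drop the surplus bits.  */
  shift = (unsigned int)(nbytes * 8 - qbits);
  for (i = nbytes; i-- > 0; )
    {
      unsigned int v = hash[i] >> shift;
      if (i > 0 && shift)
        v |= (unsigned int)hash[i - 1] << (8 - shift);
      out[i] = (unsigned char)(v & 0xff);
    }
  return nbytes;
}
```

For example, a 24-bit value FFFFAB truncated to 12 bits becomes the two bytes 0F FF.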
|
|
* cipher/dsa.c (dsa_sign): Move opaque mpi extraction to sign.
(sign): Add args FLAGS and HASHALGO. Implement deterministic DSA.
Add code path for R==0 to comply with the standard.
(dsa_verify): Left fill opaque mpi based hash values.
* cipher/dsa-common.c (int2octets, bits2octets): New.
(_gcry_dsa_gen_rfc6979_k): New.
* tests/dsa-rfc6979.c: New.
* tests/Makefile.am (TESTS): Add dsa-rfc6979.
--
This patch also fixes a recent patch (37d0a1e) which allows passing
the hash in a (hash) element.
Support for deterministic ECDSA will come soon.
Signed-off-by: Werner Koch <wk@gnupg.org>
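The left-fill step (as used by int2octets in RFC 6979 and for opaque hash values) amounts to zero-padding a big-endian integer to a fixed octet length. A minimal sketch on byte buffers, with the hypothetical name left_fill_octets; the real helpers in cipher/dsa-common.c work on MPIs and may differ in detail:

```c
#include <stddef.h>
#include <string.h>

/* Write the big-endian integer IN (INLEN bytes) as exactly RLEN bytes
   to OUT, left-padding with zero bytes.  Returns 0 on success or -1 if
   the value does not fit.  Hypothetical helper for illustration.  */
int
left_fill_octets (const unsigned char *in, size_t inlen,
                  unsigned char *out, size_t rlen)
{
  /* Skip leading zero bytes so only significant bytes are counted.  */
  while (inlen > 0 && *in == 0)
    {
      in++;
      inlen--;
    }
  if (inlen > rlen)
    return -1;

  memset (out, 0, rlen - inlen);            /* The left fill.  */
  memcpy (out + (rlen - inlen), in, inlen); /* The value itself.  */
  return 0;
}
```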
|
|
* cipher/pubkey.c (sexp_to_key): Fallback to private key.
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* mpi/mpi-pow.c (gcry_mpi_powm): Always perform the mpi_mul for
exponents in secure memory.
--
The attack is published as http://eprint.iacr.org/2013/448 :
Flush+Reload: a High Resolution, Low Noise, L3 Cache Side-Channel
Attack by Yuval Yarom and Katrina Falkner. 18 July 2013.
Flush+Reload is a cache side-channel attack that monitors access to
data in shared pages. In this paper we demonstrate how to use the
attack to extract private encryption keys from GnuPG. The high
resolution and low noise of the Flush+Reload attack enables a spy
program to recover over 98% of the bits of the private key in a
single decryption or signing round. Unlike previous attacks, the
attack targets the last level L3 cache. Consequently, the spy
program and the victim do not need to share the execution core of
the CPU. The attack is not limited to a traditional OS and can be
used in a virtualised environment, where it can attack programs
executing in a different VM.
(cherry picked from commit 55237c8f6920c6629debd23db65e90b42a3767de)
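The idea behind the mitigation, doing the multiplication for every exponent bit and only choosing afterwards whether to keep the result, can be sketched with machine words. This is a toy under stated assumptions: the name powm_always_multiply is hypothetical, the modulus must fit in 32 bits, and the `%` operator is not constant time; the real code works on MPI limbs.

```c
#include <stdint.h>

/* Compute BASE^EXP mod M for 0 < M < 2^32 using square-and-multiply
   where the multiply is executed for every exponent bit.  */
uint32_t
powm_always_multiply (uint32_t base, uint32_t exp, uint32_t m)
{
  uint64_t result = 1 % m;
  uint64_t b = base % m;
  int i;

  for (i = 31; i >= 0; i--)
    {
      uint64_t squared = (result * result) % m;
      uint64_t multiplied = (squared * b) % m;  /* Always computed.  */
      /* All-ones mask if the exponent bit is set, zero otherwise.  */
      uint64_t mask = (uint64_t)0 - ((exp >> i) & 1);

      /* Select without branching on the secret bit.  */
      result = (multiplied & mask) | (squared & ~mask);
    }
  return (uint32_t)result;
}
```

The sequence of squarings and multiplications is then the same for every exponent of a given length, which is the property the cache attack exploits when it is missing.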
|
|
* cipher/pubkey.c (pubkey_sign): Add arg ctx and pass it to the sign
module.
(gcry_pk_sign): Pass CTX to pubkey_sign.
(sexp_data_to_mpi): Add flag rfc6979 and code to allow a hash with *DSA.
* cipher/rsa.c (rsa_sign, rsa_verify): Return an error if an opaque
MPI is given for DATA/HASH.
* cipher/elgamal.c (elg_sign, elg_verify): Ditto.
* cipher/dsa.c (dsa_sign, dsa_verify): Convert a given opaque MPI.
* cipher/ecc.c (ecc_sign, ecc_verify): Ditto.
* tests/basic.c (check_pubkey_sign_ecdsa): Add a test for using a hash
element with DSA.
--
This patch allows the use of
  (data (flags raw)
    (hash sha256 #80112233445566778899AABBCCDDEEFF
                  000102030405060708090A0B0C0D0E0F#))
in addition to the old but more efficient
  (data (flags raw)
    (value #80112233445566778899AABBCCDDEEFF
            000102030405060708090A0B0C0D0E0F#))
for DSA and ECDSA. With the hash element the flag "raw" must be
explicitly given because existing regression test code expects that
a conflict error is returned if a hash element is given without flags.
Note that the hash algorithm name is currently not checked. It may
eventually be used to cross-check the length of the provided hash
value. It is suggested that the correct hash name be given, even if
a truncated hash value is used.
Finally this patch adds a way to pass the hash algorithm and flag
values to the signing module. "rfc6979" has been implemented as a new
but not yet used flag.
Signed-off-by: Werner Koch <wk@gnupg.org>
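The textual form of the new data element can be produced as sketched below. The helper format_hash_sexp is hypothetical and only builds the string; an application would still have to convert it to a gcry_sexp_t with the library's own S-expression functions.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format "(data (flags raw) (hash ALGO #HEX#))" into BUF.  Returns the
   string length, or -1 if BUF is too small.  Hypothetical helper.  */
int
format_hash_sexp (char *buf, size_t buflen, const char *algo,
                  const unsigned char *hash, size_t hashlen)
{
  static const char head[] = "(data (flags raw) (hash ";
  /* head + algo + " #" + hex digits + "#))" + terminating NUL.  */
  size_t need = strlen (head) + strlen (algo) + 2 + 2 * hashlen + 3 + 1;
  size_t pos, i;

  if (buflen < need)
    return -1;

  pos = (size_t) snprintf (buf, buflen, "%s%s #", head, algo);
  for (i = 0; i < hashlen; i++)
    pos += (size_t) snprintf (buf + pos, buflen - pos, "%02X",
                              (unsigned int) hash[i]);
  pos += (size_t) snprintf (buf + pos, buflen - pos, "#))");
  return (int) pos;
}
```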
|
|
* src/sexp.c (gcry_sexp_nth_buffer): New.
* src/visibility.c, src/visibility.h: Add function wrapper.
* src/libgcrypt.vers, src/libgcrypt.def: Add to API.
* src/gcrypt.h.in: Add prototype.
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* src/gcrypt.h.in (GCRY_CIPHER_SALSA20): New.
* cipher/salsa20.c: New.
* configure.ac (available_ciphers): Add Salsa20.
* cipher/cipher.c: Register Salsa20.
(cipher_setiv): Allow diverting an IV to a cipher module.
* src/cipher-proto.h (cipher_setiv_func_t): New.
(cipher_extra_spec): Add field setiv.
* src/cipher.h: Declare Salsa20 definitions.
* tests/basic.c (check_stream_cipher): New.
(check_stream_cipher_large_block): New.
(check_cipher_modes): Run new test functions.
(check_ciphers): Add simple test for Salsa20.
Signed-off-by: Werner Koch <wk@gnupg.org>
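Diverting the IV to a cipher module can be pictured as an optional function pointer that takes precedence over the generic IV buffer. The sketch below uses made-up demo_* types and names; it is not the layout of cipher_extra_spec or of the cipher handle in cipher/cipher.c.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_IV 16

typedef void (*demo_setiv_func_t) (void *module_ctx,
                                   const unsigned char *iv, size_t ivlen);

struct demo_cipher_handle
{
  demo_setiv_func_t setiv;       /* NULL if the module has no handler.  */
  void *module_ctx;
  unsigned char iv[DEMO_MAX_IV]; /* Generic IV used by block modes.  */
  size_t ivlen;
};

/* Example module state and handler, e.g. for a stream cipher nonce.  */
struct demo_stream_ctx
{
  unsigned char nonce[8];
  int have_nonce;
};

void
demo_stream_setiv (void *ctx, const unsigned char *iv, size_t ivlen)
{
  struct demo_stream_ctx *c = ctx;

  memset (c->nonce, 0, sizeof c->nonce);
  if (ivlen > sizeof c->nonce)
    ivlen = sizeof c->nonce;
  memcpy (c->nonce, iv, ivlen);
  c->have_nonce = 1;
}

void
demo_cipher_setiv (struct demo_cipher_handle *h,
                   const unsigned char *iv, size_t ivlen)
{
  if (h->setiv)
    {
      h->setiv (h->module_ctx, iv, ivlen);  /* Divert to the module.  */
      return;
    }
  if (ivlen > DEMO_MAX_IV)
    ivlen = DEMO_MAX_IV;
  memset (h->iv, 0, sizeof h->iv);
  memcpy (h->iv, iv, ivlen);
  h->ivlen = ivlen;
}
```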
|
|
* mpi/mpicoder.c (gcry_mpi_dump): Detect and print opaque MPIs.
* tests/mpitests.c (test_opaque): New.
(main): Call new test.
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* src/gcrypt-module.h (gcry_pk_sign_t): Add parms flags and hashalgo.
* cipher/rsa.c (rsa_sign): Add parms and mark them as unused.
* cipher/dsa.c (dsa_sign): Ditto.
* cipher/elgamal.c (elg_sign): Ditto.
* cipher/pubkey.c (dummy_sign): Ditto.
(pubkey_sign): Pass 0 for the new args.
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* mpi/mpi-pow.c (gcry_mpi_powm): For a zero exponent, make sure that
the result has been allocated.
--
This code triggered the problem:
  modulus = gcry_mpi_set_ui(NULL, 100);
  generator = gcry_mpi_set_ui(NULL, 3);
  exponent = gcry_mpi_set_ui(NULL, 0);
  result = gcry_mpi_new(0);
  gcry_mpi_powm(result, generator, exponent, modulus);
gcry_mpi_new(0) does not allocate any limb space, so it is not
possible to write even into the first limb. A workaround was to use
gcry_mpi_new(1), but a real fix is better.
Reported-by: Ian Goldberg
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* tests/t-mpi-point.c (basic_ec_math, basic_ec_math_simplified): add
calls to gcry_ctx_release() to free contexts after they become unused.
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
|
|
* random/rndw32.c: include winsock2.h before windows.h.
* src/ath.h [_WIN32]: Ditto.
* tests/benchmark.c [_WIN32]: Ditto.
--
Patch silences warnings of the following type:
/usr/lib/gcc/i686-w64-mingw32/4.6/../../../../i686-w64-mingw32/include/winsock2.h:15:2: warning: #warning Please include winsock2.h before windows.h [-Wcpp]
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* mpi/amd64/mpih-mul2.S: remove duplicated header.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/bithelp.h [__GNUC__, __i386__] (rol, ror): add "cc" clobber
for inline assembly.
* cipher/cast5.c [__GNUC__, __i386__] (rol): Ditto.
* random/rndhw.c [USE_DRNG] (rdrand_long): Ditto.
* src/hmac256.c [__GNUC__, __i386__] (ror): Ditto.
* mpi/longlong.h [__i386__] (add_ssaaaa, sub_ddmmss, umul_ppmm)
(udiv_qrnnd, count_leading_zeros, count_trailing_zeros): Ditto.
--
These assembly snippets modify the condition flags but do not declare the
"cc" clobber.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
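A rotate helper with the clobber declared might look like the sketch below. The name rol32 and the portable fallback are assumptions made for illustration; the macros in the listed files are written differently.

```c
#include <stdint.h>

/* Rotate X left by N bits (N is reduced modulo 32).  */
uint32_t
rol32 (uint32_t x, unsigned int n)
{
  n &= 31;
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  /* ROL updates the carry and overflow flags, so "cc" is listed as
     clobbered; otherwise the compiler may assume the flags survive.  */
  __asm__ ("roll %%cl, %0"
           : "+r" (x)
           : "c" (n)
           : "cc");
  return x;
#else
  /* Portable fallback; the N == 0 case avoids a shift by 32.  */
  return n ? (x << n) | (x >> (32 - n)) : x;
#endif
}
```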
|
|
* cipher/bufhelp.h (buf_xor, buf_xor_2dst, buf_xor_n_copy): Cast
to larger element pointer through (void *) to suppress -Wcast-align.
--
Patch disables bogus warnings caused by -Wcast-align. We know that byte
pointers are properly aligned at these phases, or that hardware can handle
unaligned accesses.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* mpi/longlong.h [__arm__]: Construct __ARM_ARCH if not provided by
compiler.
--
GCC 4.8 defines __ARM_ARCH, which provides a forward-compatible way to detect
the ARM architecture version. Use it when available and construct it otherwise.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* mpi/longlong.h [__arm__] (add_ssaaaa, sub_ddmmss): Add __CLOBBER_CC.
[__arm__][__ARM_ARCH <= 3] (umul_ppmm): Ditto.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* mpi/longlong.h [__arm__]: Enable inline assembly if __thumb2__ is
defined.
[__arm__]: Use __ARM_ARCH when defined.
[__arm__] [__ARM_ARCH >= 5] (count_leading_zeros): New.
--
Current ARM Linux distributions use EABI that enables thumb2, and therefore
inline assembly is disabled (because of the !defined(__thumb__) selector).
However, thumb2 allows the use of the assembly instructions that longlong.h
contains for ARM. So this patch enables inline assembly for ARM when
__thumb2__ is defined in addition to __thumb__.
The patch also adds an optimization of the count_leading_zeros() macro for ARM.
Results on Cortex-A8, 1 GHz:
===
Before:
Algorithm         generate  100*sign  100*verify
------------------------------------------------
RSA 1024 bit         750ms    2780ms       110ms
RSA 2048 bit       14280ms   17250ms       300ms
RSA 3072 bit       38630ms   51300ms       650ms
RSA 4096 bit       60940ms  111430ms      1000ms
jussi@cubie:~/libgcrypt$ tests/benchmark dsa
Algorithm         generate  100*sign  100*verify
------------------------------------------------
DSA 1024/160             -    1410ms      1680ms
DSA 2048/224             -    6100ms      7390ms
DSA 3072/256             -   14350ms     17120ms
jussi@cubie:~/libgcrypt$ tests/benchmark ecc
Algorithm         generate  100*sign  100*verify
------------------------------------------------
ECDSA 192 bit         90ms    2160ms      3940ms
ECDSA 224 bit        110ms    2810ms      5400ms
ECDSA 256 bit        150ms    3570ms      6970ms
ECDSA 384 bit        340ms    8320ms     16420ms
ECDSA 521 bit        850ms   19760ms     38480ms
After:
jussi@cubie:~/libgcrypt$ tests/benchmark rsa
Algorithm         generate  100*sign  100*verify
------------------------------------------------
RSA 1024 bit         590ms    2230ms        80ms
RSA 2048 bit        2320ms   13090ms       240ms
RSA 3072 bit       60580ms   38420ms       460ms
RSA 4096 bit      115130ms   82250ms       750ms
jussi@cubie:~/libgcrypt$ tests/benchmark dsa
Algorithm         generate  100*sign  100*verify
------------------------------------------------
DSA 1024/160             -    1070ms      1290ms
DSA 2048/224             -    4500ms      5550ms
DSA 3072/256             -   10280ms     12200ms
jussi@cubie:~/libgcrypt$ tests/benchmark ecc
Algorithm         generate  100*sign  100*verify
------------------------------------------------
ECDSA 192 bit         70ms    1900ms      3560ms
ECDSA 224 bit        100ms    2490ms      4750ms
ECDSA 256 bit        120ms    3140ms      5920ms
ECDSA 384 bit        270ms    6990ms     13790ms
ECDSA 521 bit        680ms   17080ms     33490ms
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
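The count_leading_zeros optimization relies on the CLZ instruction that ARM provides from architecture version 5 on. A hedged sketch as a function (the real longlong.h uses a macro, and the name count_leading_zeros32 and the loop fallback are assumptions):

```c
#include <stdint.h>

/* Return the number of leading zero bits in X; 32 for X == 0.  */
unsigned int
count_leading_zeros32 (uint32_t x)
{
#if defined(__GNUC__) && defined(__arm__) \
    && defined(__ARM_ARCH) && __ARM_ARCH >= 5
  uint32_t n;

  /* A single instruction; CLZ yields 32 for a zero operand.  */
  __asm__ ("clz %0, %1" : "=r" (n) : "r" (x));
  return n;
#else
  /* Portable fallback: shift left until the top bit is set.  */
  unsigned int n = 0;

  if (x == 0)
    return 32;
  while (!(x & 0x80000000u))
    {
      x <<= 1;
      n++;
    }
  return n;
#endif
}
```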
|
|
* configure.ac (AH_BOTTOM): Move GPG_ERR_ replacement defines to ...
* src/gcrypt-int.h: new file.
* src/visibility.h, src/cipher.h: Replace gcrypt.h by gcrypt-int.h.
* tests/: Ditto for all test files.
--
Defining newer gpg-error codes in config.h was not a good idea,
because config.h is usually included before gpg-error.h. The
replacement macros were thus expanded inside the enum in gpg-error.h,
leading to faulty code there like
  typedef enum
    {
      [...]
      191 = 191,
      [...]
    };
|
|
* cipher/blowfish-amd64.S: Enable only if
HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS is defined.
* cipher/camellia-aesni-avx-amd64.S: Ditto.
* cipher/camellia-aesni-avx2-amd64.S: Ditto.
* cipher/cast5-amd64.S: Ditto.
* cipher/rijndael-amd64.S: Ditto.
* cipher/serpent-avx2-amd64.S: Ditto.
* cipher/serpent-sse2-amd64.S: Ditto.
* cipher/twofish-amd64.S: Ditto.
* cipher/blowfish.c: Use AMD64 assembly implementation only if
HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS is defined.
* cipher/camellia-glue.c: Ditto.
* cipher/cast5.c: Ditto.
* cipher/rijndael.c: Ditto.
* cipher/serpent.c: Ditto.
* cipher/twofish.c: Ditto.
* configure.ac: Check gcc/as compatibility with AMD64 assembly
implementations.
--
Later these checks can be split and the assembly implementations adapted to
handle different platforms, but for now disable the AMD64 assembly
implementations if the assembler does not appear to be able to handle them.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* src/misc.c (_gcry_burn_stack): Add optimization for 32-bit and 64-bit
architectures.
--
Busy looping 'tests/benchmark --cipher-repetitions 10 cipher blowfish' on ARM
Cortex-A8 shows that _gcry_burn_stack takes 21% of CPU time. With this patch,
that number drops to 3.4%.
On AMD64 (Intel i5-4570) CPU usage for _gcry_burn_stack in the same test drops
from 3.5% to 1.1%.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/Makefile.am: Add 'camellia-aesni-avx2-amd64.S'.
* cipher/camellia-aesni-avx2-amd64.S: New file.
* cipher/camellia-glue.c (USE_AESNI_AVX2): New macro.
(CAMELLIA_context) [USE_AESNI_AVX2]: Add 'use_aesni_avx2'.
[USE_AESNI_AVX2] (_gcry_camellia_aesni_avx2_ctr_enc)
(_gcry_camellia_aesni_avx2_cbc_dec)
(_gcry_camellia_aesni_avx2_cfb_dec): New prototypes.
(camellia_setkey) [USE_AESNI_AVX2]: Check AVX2+AES-NI capable hardware
and set 'ctx->use_aesni_avx2'.
(_gcry_camellia_ctr_enc) [USE_AESNI_AVX2]: Add AVX2 accelerated code.
(_gcry_camellia_cbc_dec) [USE_AESNI_AVX2]: Add AVX2 accelerated code.
(_gcry_camellia_cfb_dec) [USE_AESNI_AVX2]: Add AVX2 accelerated code.
(selftest_ctr_128, selftest_cbc_128, selftest_cfb_128): Grow 'nblocks'
so that AVX2 codepaths get tested.
* configure.ac (camellia) [avx2support, aesnisupport]: Add
'camellia-aesni-avx2-amd64.lo'.
--
Add new AVX2/AES-NI implementation of Camellia that processes 32 blocks in
parallel.
Speed old (AVX/AES-NI) vs. new (AVX2/AES-NI) on Intel Core i5-4570:
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
CAMELLIA128    1.00x   0.99x   1.00x   1.53x   1.00x   1.49x   1.00x   1.00x   1.54x   1.54x
CAMELLIA256    0.99x   1.00x   1.00x   1.50x   1.00x   1.50x   1.00x   1.00x   1.54x   1.52x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/Makefile.am: Add 'serpent-avx2-amd64.S'.
* cipher/serpent-avx2-amd64.S: New file.
* cipher/serpent.c (USE_AVX2): New macro.
(serpent_context_t) [USE_AVX2]: Add 'use_avx2'.
[USE_AVX2] (_gcry_serpent_avx2_ctr_enc, _gcry_serpent_avx2_cbc_dec)
(_gcry_serpent_avx2_cfb_dec): New prototypes.
(serpent_setkey_internal) [USE_AVX2]: Check for AVX2 capable hardware
and set 'use_avx2'.
(_gcry_serpent_ctr_enc) [USE_AVX2]: Use AVX2 accelerated functions.
(_gcry_serpent_cbc_dec) [USE_AVX2]: Use AVX2 accelerated functions.
(_gcry_serpent_cfb_dec) [USE_AVX2]: Use AVX2 accelerated functions.
(selftest_ctr_128, selftest_cbc_128, selftest_cfb_128): Grow 'nblocks'
so that AVX2 codepaths are tested.
* configure.ac (serpent) [avx2support]: Add 'serpent-avx2-amd64.lo'.
--
Add new AVX2 implementation of Serpent that processes 16 blocks in parallel.
Speed old (SSE2) vs. new (AVX2) on Intel Core i5-4570:
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
SERPENT128     1.00x   1.00x   1.00x   2.10x   1.00x   2.16x   1.01x   1.00x   2.16x   2.18x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* configure.ac: Add option --disable-avx2-support.
(HAVE_GCC_INLINE_ASM_AVX2): New.
(ENABLE_AVX2_SUPPORT): New.
* src/g10lib.h (HWF_INTEL_AVX2): New.
* src/global.c (hwflist): Add HWF_INTEL_AVX2.
* src/hwf-x86.c [__i386__] (get_cpuid): Initialize registers to zero
before cpuid.
[__x86_64__] (get_cpuid): Initialize registers to zero before cpuid.
(detect_x86_gnuc): Store maximum cpuid level.
(detect_x86_gnuc) [ENABLE_AVX2_SUPPORT]: Add detection for AVX2.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/Makefile.am: Add 'twofish-amd64.S'.
* cipher/twofish-amd64.S: New file.
* cipher/twofish.c (USE_AMD64_ASM): New macro.
[USE_AMD64_ASM] (_gcry_twofish_amd64_encrypt_block)
(_gcry_twofish_amd64_decrypt_block, _gcry_twofish_amd64_ctr_enc)
(_gcry_twofish_amd64_cbc_dec, _gcry_twofish_amd64_cfb_dec): New
prototypes.
[USE_AMD64_ASM] (do_twofish_encrypt, do_twofish_decrypt)
(twofish_encrypt, twofish_decrypt): New functions.
(_gcry_twofish_ctr_enc, _gcry_twofish_cbc_dec, _gcry_twofish_cfb_dec)
(selftest_ctr, selftest_cbc, selftest_cfb): New functions.
(selftest): Call new bulk selftests.
* cipher/cipher.c (gcry_cipher_open) [USE_TWOFISH]: Register Twofish
bulk functions for ctr-enc, cbc-dec and cfb-dec.
* configure.ac (twofish) [x86_64]: Add 'twofish-amd64.lo'.
* src/cipher.h (_gcry_twofish_ctr_enc, _gcry_twofish_cbc_dec)
(_gcry_twofish_cfb_dec): New prototypes.
--
Provides non-parallel implementations for a small speed-up and 3-way parallel
implementations that get accelerated on `out-of-order' CPUs.
Speed old vs. new on Intel Core i5-4570:
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
TWOFISH128     1.08x   1.07x   1.10x   1.80x   1.09x   1.70x   1.08x   1.08x   1.70x   1.69x
Speed old vs. new on Intel Core2 T8100:
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
TWOFISH128     1.11x   1.10x   1.13x   1.65x   1.13x   1.62x   1.12x   1.11x   1.63x   1.59x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/Makefile.am: Add 'rijndael-amd64.S'.
* cipher/rijndael-amd64.S: New file.
* cipher/rijndael.c (USE_AMD64_ASM): New macro.
[USE_AMD64_ASM] (_gcry_aes_amd64_encrypt_block)
(_gcry_aes_amd64_decrypt_block): New prototypes.
(do_encrypt_aligned) [USE_AMD64_ASM]: Use amd64 assembly function.
(do_encrypt): Disable input/output alignment when USE_AMD64_ASM is set.
(do_decrypt_aligned) [USE_AMD64_ASM]: Use amd64 assembly function.
(do_decrypt): Disable input/output alignment when USE_AMD64_ASM is set.
* configure.ac (aes) [x86_64]: Add 'rijndael-amd64.lo'.
--
Add optimized amd64 assembly implementation for AES.
Old vs new, on AMD Phenom II:
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
AES            1.74x   1.72x   1.81x   1.85x   1.82x   1.76x   1.67x   1.64x   1.79x   1.81x
AES192         1.77x   1.77x   1.79x   1.88x   1.90x   1.80x   1.69x   1.69x   1.85x   1.81x
AES256         1.79x   1.81x   1.83x   1.89x   1.88x   1.82x   1.72x   1.70x   1.87x   1.89x
Old vs new, on Intel Core2:
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
AES            1.77x   1.75x   1.78x   1.76x   1.76x   1.77x   1.75x   1.76x   1.76x   1.82x
AES192         1.80x   1.73x   1.81x   1.76x   1.79x   1.85x   1.77x   1.76x   1.80x   1.85x
AES256         1.81x   1.77x   1.81x   1.77x   1.80x   1.79x   1.78x   1.77x   1.81x   1.85x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/Makefile.am: Add 'blowfish-amd64.S'.
* cipher/blowfish-amd64.S: New file.
* cipher/blowfish.c (USE_AMD64_ASM): New macro.
[USE_AMD64_ASM] (_gcry_blowfish_amd64_do_encrypt)
(_gcry_blowfish_amd64_encrypt_block)
(_gcry_blowfish_amd64_decrypt_block, _gcry_blowfish_amd64_ctr_enc)
(_gcry_blowfish_amd64_cbc_dec, _gcry_blowfish_amd64_cfb_dec): New
prototypes.
[USE_AMD64_ASM] (do_encrypt, do_encrypt_block, do_decrypt_block)
(encrypt_block, decrypt_block): New functions.
(_gcry_blowfish_ctr_enc, _gcry_blowfish_cbc_dec)
(_gcry_blowfish_cfb_dec, selftest_ctr, selftest_cbc, selftest_cfb): New
functions.
(selftest): Call new bulk selftests.
* cipher/cipher.c (gcry_cipher_open) [USE_BLOWFISH]: Register Blowfish
bulk functions for ctr-enc, cbc-dec and cfb-dec.
* configure.ac (blowfish) [x86_64]: Add 'blowfish-amd64.lo'.
* src/cipher.h (_gcry_blowfish_ctr_enc, _gcry_blowfish_cbc_dec)
(_gcry_blowfish_cfb_dec): New prototypes.
--
Add non-parallel functions for small speed-up and 4-way parallel functions for
modes of operation that support parallel processing.
Speed old vs. new on AMD Phenom II X6 1055T:
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
BLOWFISH       1.21x   1.12x   1.17x   3.52x   1.18x   3.34x   1.16x   1.15x   3.38x   3.47x
Speed old vs. new on Intel Core i5-2450M (Sandy-Bridge):
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
BLOWFISH       1.16x   1.10x   1.17x   2.98x   1.18x   2.88x   1.16x   1.15x   3.00x   3.02x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/ecc.c (generate_key): Use point_snatch_set, replace unneeded
variable copies, etc.
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* cipher/dsa.c (gen_k): Factor code out to ..
* cipher/dsa-common.c (_gcry_dsa_gen_k): new file and function. Add
arg security_level and re-indent a bit.
* cipher/ecc.c (gen_k): Remove and change callers to _gcry_dsa_gen_k.
* cipher/dsa.c: Include pubkey-internal.h.
* cipher/Makefile.am (libcipher_la_SOURCES): Add dsa-common.c.
--
The ECDSA code used the simple $k = k \bmod p$ method which introduces
a small bias. We now use the bias free method we have always used
with DSA.
Signed-off-by: Werner Koch <wk@gnupg.org>
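The bias of plain modular reduction, and how rejection sampling avoids it, can be shown with small numbers. The sketch below is an illustration only: the function names and the toy xorshift generator are assumptions, and the retry strategy of _gcry_dsa_gen_k may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Count how often each residue occurs when every NBITS-bit value is
   reduced modulo Q.  COUNTS must hold Q entries; NBITS must be < 32.  */
void
mod_reduction_histogram (unsigned int nbits, unsigned int q,
                         unsigned int *counts)
{
  unsigned int r, i;

  for (i = 0; i < q; i++)
    counts[i] = 0;
  for (r = 0; r < (1u << nbits); r++)
    counts[r % q]++;
}

/* Toy xorshift32 generator; NOT cryptographically secure.  STATE must
   be non-zero.  */
static uint32_t
toy_random (uint32_t *state)
{
  uint32_t x = *state;

  x ^= x << 13;
  x ^= x >> 17;
  x ^= x << 5;
  return *state = x;
}

/* Rejection sampling: draw NBITS-bit candidates until one lies in the
   range [1, Q-1].  Every accepted value is equally likely.  */
unsigned int
gen_k_rejection (unsigned int nbits, unsigned int q, uint32_t *state)
{
  unsigned int mask = (nbits >= 32) ? 0xffffffffu : ((1u << nbits) - 1);

  assert (q >= 2);
  for (;;)
    {
      unsigned int k = toy_random (state) & mask;

      if (k > 0 && k < q)
        return k;
    }
}
```

With 3-bit inputs and q = 5, reduction maps two inputs each to 0, 1 and 2 but only one input each to 3 and 4; rejection sampling discards the out-of-range candidates instead of folding them back.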
|
|
* cipher/Makefile.am: Add 'cast5-amd64.S'.
* cipher/cast5-amd64.S: New file.
* cipher/cast5.c (USE_AMD64_ASM): New macro.
(_gcry_cast5_s1tos4): Merge arrays s1, s2, s3, s4 to single array to
simplify access from assembly implementation.
(s1, s2, s3, s4): New macros pointing to subarrays in
_gcry_cast5_s1tos4.
[USE_AMD64_ASM] (_gcry_cast5_amd64_encrypt_block)
(_gcry_cast5_amd64_decrypt_block, _gcry_cast5_amd64_ctr_enc)
(_gcry_cast5_amd64_cbc_dec, _gcry_cast5_amd64_cfb_dec): New prototypes.
[USE_AMD64_ASM] (do_encrypt_block, do_decrypt_block, encrypt_block)
(decrypt_block): New functions.
(_gcry_cast5_ctr_enc, _gcry_cast5_cbc_dec, _gcry_cast5_cfb_dec)
(selftest_ctr, selftest_cbc, selftest_cfb): New functions.
(selftest): Call new bulk selftests.
* cipher/cipher.c (gcry_cipher_open) [USE_CAST5]: Register CAST5 bulk
functions for ctr-enc, cbc-dec and cfb-dec.
* configure.ac (cast5) [x86_64]: Add 'cast5-amd64.lo'.
* src/cipher.h (_gcry_cast5_ctr_enc, _gcry_cast5_cbc_dec)
(_gcry_cast5_cfb_dec): New prototypes.
--
Provides non-parallel implementations for a small speed-up and 4-way parallel
implementations that get accelerated on `out-of-order' CPUs.
Speed old vs. new on AMD Phenom II X6 1055T:
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
CAST5          1.23x   1.22x   1.21x   2.86x   1.21x   2.83x   1.22x   1.17x   2.73x   2.73x
Speed old vs. new on Intel Core i5-2450M (Sandy-Bridge):
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
CAST5          1.00x   1.04x   1.06x   2.56x   1.06x   2.37x   1.03x   1.01x   2.43x   2.41x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/cipher-selftest.c (_gcry_selftest_helper_cbc_128)
(_gcry_selftest_helper_cfb_128, _gcry_selftest_helper_ctr_128): Renamed
functions from '<name>_128' to '<name>'.
(_gcry_selftest_helper_cbc, _gcry_selftest_helper_cfb)
(_gcry_selftest_helper_ctr): Make work with different block sizes.
* cipher/cipher-selftest.h (_gcry_selftest_helper_cbc_128)
(_gcry_selftest_helper_cfb_128, _gcry_selftest_helper_ctr_128): Renamed
prototypes from '<name>_128' to '<name>'.
* cipher/camellia-glue.c (selftest_ctr_128, selftest_cfb_128)
(selftest_cbc_128): Change to use new function names.
* cipher/rijndael.c (selftest_ctr_128, selftest_cfb_128)
(selftest_cbc_128): Change to use new function names.
* cipher/serpent.c (selftest_ctr_128, selftest_cfb_128)
(selftest_cbc_128): Change to use new function names.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/cipher.c (gcry_cipher_open): Add bulk CFB decryption function
for Serpent.
* cipher/serpent-sse2-amd64.S (_gcry_serpent_sse2_cfb_dec): New
function.
* cipher/serpent.c (_gcry_serpent_sse2_cfb_dec): New prototype.
(_gcry_serpent_cfb_dec): New function.
(selftest_cfb_128): New function.
(selftest): Call selftest_cfb_128.
* src/cipher.h (_gcry_serpent_cfb_dec): New prototype.
--
Patch makes Serpent-CFB decryption 4.0 times faster on Intel Sandy-Bridge and
2.7 times faster on AMD K10.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/camellia-aesni-avx-amd64.S
(_gcry_camellia_aesni_avx_cfb_dec): New function.
* cipher/camellia-glue.c (_gcry_camellia_aesni_avx_cfb_dec): New
prototype.
(_gcry_camellia_cfb_dec): New function.
(selftest_cfb_128): New function.
(selftest): Call selftest_cfb_128.
* cipher/cipher.c (gcry_cipher_open): Add bulk CFB decryption function
for Camellia.
* src/cipher.h (_gcry_camellia_cfb_dec): New prototype.
--
Patch makes Camellia-CFB decryption 4.7 times faster on Intel Sandy-Bridge.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/cipher-selftest.c (_gcry_selftest_helper_cfb_128): New
function for CFB selftests.
* cipher/cipher-selftest.h (_gcry_selftest_helper_cfb_128): New
prototype.
* cipher/rijndael.c [USE_AESNI] (do_aesni_enc_vec4): New function.
(_gcry_aes_cfb_dec) [USE_AESNI]: Add parallelized CFB decryption.
(selftest_cfb_128): New function.
(selftest): Call selftest_cfb_128.
--
CFB decryption can be parallelized for additional performance. On Intel
Sandy-Bridge processor, this change makes CFB decryption 4.6 times faster.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
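Why CFB decryption parallelizes while encryption does not can be seen in a small sketch: each plaintext block depends only on two ciphertext blocks, all of which are known up front. The 64-bit toy_encrypt_block below is a stand-in mixing function and not a real cipher; all names are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy keyed mixing function standing in for a block cipher.  NOT
   secure; it only has to be deterministic for this illustration.  */
uint64_t
toy_encrypt_block (uint64_t key, uint64_t x)
{
  x ^= key;
  x = (x << 13) | (x >> 51);
  return x * 0x9E3779B97F4A7C15ull;
}

/* CFB encryption is inherently serial: block I needs ciphertext I-1,
   which only exists after the previous iteration.  */
void
cfb_encrypt (uint64_t key, uint64_t iv,
             const uint64_t *pt, uint64_t *ct, size_t n)
{
  uint64_t prev = iv;
  size_t i;

  for (i = 0; i < n; i++)
    {
      ct[i] = pt[i] ^ toy_encrypt_block (key, prev);
      prev = ct[i];
    }
}

/* CFB decryption: every iteration reads only CT[I-1] and CT[I], so the
   iterations are independent and can be processed several at a time.
   PT and CT must not overlap in this sketch.  */
void
cfb_decrypt (uint64_t key, uint64_t iv,
             const uint64_t *ct, uint64_t *pt, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      uint64_t prev = i ? ct[i - 1] : iv;

      pt[i] = ct[i] ^ toy_encrypt_block (key, prev);
    }
}
```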
|
|
* cipher/cipher-selftest.c (_gcry_selftest_helper_cbc_128)
(_gcry_selftest_helper_ctr_128): Rename setkey to setkey_func.
--
setkey is a POSIX.1 function declared in stdlib.h.
|
|
* configure.ac (serpent): Add 'serpent-sse2-amd64.lo'.
* cipher/Makefile.am (EXTRA_libcipher_la_SOURCES): Add
'serpent-sse2-amd64.S'.
* cipher/cipher.c (gcry_cipher_open) [USE_SERPENT]: Register bulk
functions for CBC-decryption and CTR-mode.
* cipher/serpent.c (USE_SSE2): New macro.
[USE_SSE2] (_gcry_serpent_sse2_ctr_enc, _gcry_serpent_sse2_cbc_dec):
New prototypes to assembler functions.
(serpent_setkey): Set 'serpent_init_done' before calling serpent_test.
(_gcry_serpent_ctr_enc): New function.
(_gcry_serpent_cbc_dec): New function.
(selftest_ctr_128): New function.
(selftest_cbc_128): New function.
(selftest): Call selftest_ctr_128 and selftest_cbc_128.
* cipher/serpent-sse2-amd64.S: New file.
* src/cipher.h (_gcry_serpent_ctr_enc): New prototype.
(_gcry_serpent_cbc_dec): New prototype.
--
[v2]: Converted to SSE2, to support all amd64 processors (SSE2 is a
feature required by the AMD64 SysV ABI).
Patch adds word-sliced SSE2 implementation of Serpent for amd64 for speeding
up parallelizable workloads (CTR mode, CBC mode decryption). Implementation
processes eight blocks in parallel, with two four-block sets interleaved for
out-of-order scheduling.
Speed old vs. new on Intel Core i5-2450M (Sandy-Bridge):
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
SERPENT128     1.00x   0.99x   1.00x   3.98x   1.00x   1.01x   1.00x   1.01x   4.04x   4.04x
Speed old vs. new on AMD Phenom II X6 1055T:
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
SERPENT128     1.02x   1.01x   1.00x   2.83x   1.00x   1.00x   1.00x   1.00x   2.72x   2.72x
Speed old vs. new on Intel Core2 Duo T8100:
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
SERPENT128     1.00x   1.02x   0.97x   4.02x   0.98x   1.01x   0.98x   1.00x   3.82x   3.91x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/serpent.c (SBOX0, SBOX1, SBOX2, SBOX3, SBOX4, SBOX5, SBOX6)
(SBOX7, SBOX0_INVERSE, SBOX1_INVERSE, SBOX2_INVERSE, SBOX3_INVERSE)
(SBOX4_INVERSE, SBOX5_INVERSE, SBOX6_INVERSE, SBOX7_INVERSE): Replace
with new definitions.
--
These new S-box definitions are from the paper:
D. A. Osvik, “Speeding up Serpent,” in Third AES Candidate Conference,
(New York, New York, USA), p. 317–329, National Institute of Standards and
Technology, 2000. Available at http://www.ii.uib.no/~osvik/pub/aes3.ps.gz
Although these were optimized for two-operand instructions on i386 and for
old Pentium-1 processors, they are slightly faster on current processors
on i386 and x86-64. On ARM, the performance of these S-boxes is about the
same as with the old S-boxes.
new vs old speed ratios (AMD K10, x86-64):
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
SERPENT128     1.06x   1.02x   1.06x   1.02x   1.06x   1.06x   1.06x   1.05x   1.07x   1.07x
new vs old speed ratios (Intel Atom, i486):
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
SERPENT128     1.12x   1.15x   1.12x   1.15x   1.13x   1.11x   1.12x   1.12x   1.12x   1.13x
new vs old speed ratios (ARM Cortex A8):
               ECB/Stream          CBC             CFB             OFB             CTR
             --------------- --------------- --------------- --------------- ---------------
SERPENT128     1.04x   1.02x   1.02x   0.99x   1.02x   1.02x   1.03x   1.03x   1.01x   1.01x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* src/Makefile.am (install-def-file): Create libdir first.
--
Reported-by: LRN <lrn1986@gmail.com>
|
|
* src/gcrypt.h.in (GCRYCTL_DISABLE_LOCKED_SECMEM): New.
(GCRYCTL_DISABLE_PRIV_DROP): New.
* src/global.c (_gcry_vcontrol): Implement them.
* src/secmem.h (GCRY_SECMEM_FLAG_NO_MLOCK): New.
(GCRY_SECMEM_FLAG_NO_PRIV_DROP): New.
* src/secmem.c (no_mlock, no_priv_drop): New.
(_gcry_secmem_set_flags, _gcry_secmem_get_flags): Set and get them.
(lock_pool): Handle no_mlock and no_priv_drop.
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* ltmain.sh (sed_uncomment_deffile): New.
(orig_export_symbols): Uncomment def file before testing for EXPORTS.
* m4/libtool.m4: Do the same for the generated code.
--
The old code was not correct in that it only looked at the first line
and put an EXPORTS keyword in front if it was missing. Binutils 2.22
accepted a duplicated EXPORTS keyword but at least 2.23.2 is more
stringent and bails out without this fix.
There is no need to send this upstream. Upstream's git master has a
lot of changes including a similar fix for this problem. There are
no signs that a libtool 2.4.3 will be released to fix this problem and
thus we need to stick to our copy of 2.4.2 along with this patch.
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* cipher/rijndael.c (selftest_cbc_128): New.
(selftest): Call selftest_cbc_128.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/rijndael.c (selftest_ctr_128): Change to use new selftest
helper function.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/Makefile.am (libcipher_la_SOURCES): Add cipher-selftest files.
* cipher/camellia-glue.c (selftest_ctr_128, selftest_cbc_128): Change
to use the new selftest helper functions.
* cipher/cipher-selftest.c: New.
* cipher/cipher-selftest.h: New.
--
Convert selftest functions into generic helper functions for code sharing.
[v2]: use syslog for more detailed selftest error messages
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/camellia-glue.c (selftest_cbc_128): New selftest function for
bulk CBC decryption.
(selftest): Add call to selftest_cbc_128.
--
Add selftest for the parallel code paths in bulk CBC decryption.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/camellia_aesni_avx_x86-64.S: Remove.
* cipher/camellia-aesni-avx-amd64.S: New.
* cipher/Makefile.am: Use the new filename.
* configure.ac: Use the new filename.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/ecc.c (generate_key): Use the same string for both fatal
messages.
|