summaryrefslogtreecommitdiff
path: root/cipher
AgeCommit message (Collapse)AuthorFilesLines
2014-11-02Add ARM/NEON implementation of Poly1305Jussi Kivilinna4-1/+747
* cipher/Makefile.am: Add 'poly1305-armv7-neon.S'. * cipher/poly1305-armv7-neon.S: New. * cipher/poly1305-internal.h (POLY1305_USE_NEON) (POLY1305_NEON_BLOCKSIZE, POLY1305_NEON_STATESIZE) (POLY1305_NEON_ALIGNMENT): New. * cipher/poly1305.c [POLY1305_USE_NEON] (_gcry_poly1305_armv7_neon_init_ext) (_gcry_poly1305_armv7_neon_finish_ext) (_gcry_poly1305_armv7_neon_blocks, poly1305_armv7_neon_ops): New. (_gcry_poly1305_init) [POLY1305_USE_NEON]: Select NEON implementation if HWF_ARM_NEON set. * configure.ac [neonsupport=yes]: Add 'poly1305-armv7-neon.lo'. -- Add Andrew Moon's public domain NEON implementation of Poly1305. Original source is available at: https://github.com/floodyberry/poly1305-opt Benchmark on Cortex-A8 (--cpu-mhz 1008): Old: | nanosecs/byte mebibytes/sec cycles/byte POLY1305 | 12.34 ns/B 77.27 MiB/s 12.44 c/B New: | nanosecs/byte mebibytes/sec cycles/byte POLY1305 | 2.12 ns/B 450.7 MiB/s 2.13 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-11-02chacha20: add ARMv7/NEON implementationJussi Kivilinna3-0/+745
* cipher/Makefile.am: Add 'chacha20-armv7-neon.S'. * cipher/chacha20-armv7-neon.S: New. * cipher/chacha20.c (USE_NEON): New. [USE_NEON] (_gcry_chacha20_armv7_neon_blocks): New. (chacha20_do_setkey) [USE_NEON]: Use Neon implementation if HWF_ARM_NEON flag set. (selftest): Self-test encrypting buffer byte by byte. * configure.ac [neonsupport=yes]: Add 'chacha20-armv7-neon.lo'. -- Add Andrew Moon's public domain ARMv7/NEON implementation of ChaCha20. Original source is available at: https://github.com/floodyberry/chacha-opt Benchmark on Cortex-A8 (--cpu-mhz 1008): Old: CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 13.45 ns/B 70.92 MiB/s 13.56 c/B STREAM dec | 13.45 ns/B 70.90 MiB/s 13.56 c/B New: CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 6.20 ns/B 153.9 MiB/s 6.25 c/B STREAM dec | 6.20 ns/B 153.9 MiB/s 6.25 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-10-08Fix prime test for 2 and lower and add check command to mpicalc.Werner Koch1-9/+10
* cipher/primegen.c (check_prime): Return true for the small primes. (_gcry_prime_check): Return correct values for 2 and lower numbers. * src/mpicalc.c (do_primecheck): New. (main): Add command 'P'. (main): Allow for larger input data.
2014-10-04Add Whirlpool AMD64/SSE2 assembly implementationJussi Kivilinna3-37/+391
* cipher/Makefile.am: Add 'whirlpool-sse2-amd64.S'. * cipher/whirlpool-sse2-amd64.S: New. * cipher/whirlpool.c (USE_AMD64_ASM): New. (whirlpool_tables_s): New. (rc, C0, C1, C2, C3, C4, C5, C6, C7): Combine these tables into single structure and replace old tables with macros of same name. (tab): New structure containing above tables. [USE_AMD64_ASM] (_gcry_whirlpool_transform_amd64) (whirlpool_transform): New. * configure.ac [host=x86_64]: Add 'whirlpool-sse2-amd64.lo'. -- Benchmark results: On Intel Core i5-4570 (3.2 Ghz): After: WHIRLPOOL | 4.82 ns/B 197.8 MiB/s 15.43 c/B Before: WHIRLPOOL | 9.10 ns/B 104.8 MiB/s 29.13 c/B On Intel Core i5-2450M (2.5 Ghz): After: WHIRLPOOL | 8.43 ns/B 113.1 MiB/s 21.09 c/B Before: WHIRLPOOL | 13.45 ns/B 70.92 MiB/s 33.62 c/B On Intel Core2 T8100 (2.1 Ghz): After: WHIRLPOOL | 10.22 ns/B 93.30 MiB/s 21.47 c/B Before: WHIRLPOOL | 19.87 ns/B 48.00 MiB/s 41.72 c/B Summary, old vs new ratio: Intel Core i5-4570: 1.88x Intel Core i5-2450M: 1.59x Intel Core2 T8100: 1.94x Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-10-04Improved ripemd160 performanceAndrei Scherer1-189/+178
* cipher/rmd160.c (transform): Interleave the left and right lane rounds to introduce more instruction level parallelism. -- The benchmarks on different systems: Intel(R) Atom(TM) CPU N570 @ 1.66GHz before: Hash: | nanosecs/byte mebibytes/sec cycles/byte RIPEMD160 | 13.07 ns/B 72.97 MiB/s - c/B after: Hash: | nanosecs/byte mebibytes/sec cycles/byte RIPEMD160 | 11.37 ns/B 83.84 MiB/s - c/B Intel(R) Core(TM) i5-4670 CPU @ 3.40GHz before: Hash: | nanosecs/byte mebibytes/sec cycles/byte RIPEMD160 | 3.31 ns/B 288.0 MiB/s - c/B after: Hash: | nanosecs/byte mebibytes/sec cycles/byte RIPEMD160 | 2.08 ns/B 458.5 MiB/s - c/B Signed-off-by: Andrei Scherer <andsch@inbox.com>
2014-09-30mac: Fix gcry_mac_close to allow for a NULL handle.Werner Koch1-1/+2
* cipher/mac.c (_gcry_mac_close): Check for NULL. -- We always allow this for easier cleanup. actually the docs already tell that this is allowed.
2014-08-21cipher: Fix a segv in case of calling with wrong parameters.Werner Koch1-1/+1
* cipher/md.c (_gcry_md_info): Fix arg testing. -- GnuPG-bug-id: 1697
2014-08-21cipher: Fix possible NULL deref in call to prime generator.Werner Koch3-18/+41
* cipher/primegen.c (_gcry_generate_elg_prime): Change to return an error code. * cipher/dsa.c (generate): Take care of new return code. * cipher/elgamal.c (generate): Change to return an error code. Take care of _gcry_generate_elg_prime return code. (generate_using_x): Take care of _gcry_generate_elg_prime return code. (elg_generate): Propagate return code from generate. -- GnuPG-bug-id: 1699, 1700 Reported-by: S.K. Gupta Note that the NULL deref may have only happened on malloc failure.
2014-08-08ecc: Add cofactor to domain parameters.NIIBE Yutaka5-72/+151
* src/ec-context.h (mpi_ec_ctx_s): Add cofactor 'h'. * cipher/ecc-common.h (elliptic_curve_t): Add cofactor 'h'. (_gcry_ecc_update_curve_param): New API adding cofactor. * cipher/ecc-curves.c (ecc_domain_parms_t): Add cofactor 'h'. (ecc_domain_parms_t domain_parms): Add cofactors. (_gcry_ecc_fill_in_curve, _gcry_ecc_update_curve_param) (_gcry_ecc_get_curve, _gcry_mpi_ec_new, _gcry_ecc_get_param_sexp) (_gcry_ecc_get_mpi): Handle cofactor. * cipher/ecc-eddsa.c (_gcry_ecc_eddsa_genkey): Likewise. * cipher/ecc-misc.c (_gcry_ecc_curve_free) (_gcry_ecc_curve_copy): Likewise. * cipher/ecc.c (nist_generate_key, ecc_generate) (ecc_check_secret_key, ecc_sign, ecc_verify, ecc_encrypt_raw) (ecc_decrypt_raw, _gcry_pk_ecc_get_sexp, _gcry_pubkey_spec_ecc): Likewise. (compute_keygrip): Handle cofactor, but skip it for its computation. * mpi/ec.c (ec_deinit): Likewise. * tests/t-mpi-point.c (context_param): Likewise. (test_curve): Add cofactors. * tests/curves.c (sample_key_1, sample_key_2): Add cofactors. * tests/keygrip.c (key_grips): Add cofactors. -- We keep compatibility of compute_keygrip in cipher/ecc.c.
2014-07-25ecc: Support the non-standard 0x40 compression flag for EdDSA.Werner Koch4-67/+99
* cipher/ecc.c (ecc_generate): Check the "comp" flag for EdDSA. * cipher/ecc-eddsa.c (eddsa_encode_x_y): Add arg WITH_PREFIX. (_gcry_ecc_eddsa_encodepoint): Ditto. (_gcry_ecc_eddsa_ensure_compact): Handle the 0x40 compression prefix. (_gcry_ecc_eddsa_decodepoint): Ditto. * tests/keygrip.c: Check an compresssed with prefix Ed25519 key. * tests/t-ed25519.inp: Ditto.
2014-07-25cipher: Fix compiler warning for chacha20.Werner Koch1-0/+3
* cipher/chacha20.c (chacha20_blocks) [!USE_SSE2]: Do not build.
2014-06-29Speed-up SHA-1 NEON assembly implementationJussi Kivilinna1-73/+82
* cipher/sha1-armv7-neon.S: Tweak implementation for speed-up. -- Benchmark on Cortex-A8 1008Mhz: New: | nanosecs/byte mebibytes/sec cycles/byte SHA1 | 7.04 ns/B 135.4 MiB/s 7.10 c/B Old: | nanosecs/byte mebibytes/sec cycles/byte SHA1 | 7.79 ns/B 122.4 MiB/s 7.85 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-06-28gostr3411_94: rewrite to use u32 mathematicDmitry Eremin-Solenikov3-103/+139
* cipher/gost28147.c (_gcry_gost_enc_data): New. * cipher/gostr3411-94.c: Rewrite implementation to use u32 mathematic internally. * cipher/gost28147.c (_gcry_gost_enc_one): Remove. -- On my box (Core2 Duo, i386) this highly improves GOST R 34.11-94 speed. Before: GOSTR3411_94 | 55.04 ns/B 17.33 MiB/s - c/B After: GOSTR3411_94 | 36.70 ns/B 25.99 MiB/s - c/B Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2014-06-28gost28147: use bufhelp helpersDmitry Eremin-Solenikov1-36/+10
* cipher/gost28147.c (gost_setkey, gost_encrypt_block, gost_decrypt_block): use buf_get_le32/buf_put_le32 helpers. -- On my box this boosts GOST 28147-89 speed from 36 MiB/s up to 44.5 MiB/s. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2014-06-28Add GOST R 34.11-94 variant using id-GostR3411-94-CryptoProParamSetDmitry Eremin-Solenikov4-8/+31
* src/gcrypt.h.in (GCRY_MD_GOSTR3411_CP): New. * src/cipher.h (_gcry_digest_spec_gost3411_cp): New. * cipher/gost28147.c (_gcry_gost_enc_one): Differentiate between CryptoPro and Test S-Boxes. * cipher/gostr3411-94.c (_gcry_digest_spec_gost3411_cp, gost3411_cp_init): New. * cipher/md.c (md_open): GCRY_MD_GOSTR3411_CP also uses B=32. -- RFC4357 defines only two S-Boxes that should be used together with GOST R 34.11-94 - a testing one (from standard itself, for testing only) and CryptoPro one. Instead of adding a separate gcry_md_ctrl() function just to switch s-boxes, add a separate MD algorithm using CryptoPro S-box. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2014-06-28gost28147: support GCRYCTL_SET_SBOXDmitry Eremin-Solenikov1-0/+39
cipher/gost28147.c (gost_set_extra_info, gost_set_sbox): New. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2014-06-28Support setting s-box for the ciphers that require itDmitry Eremin-Solenikov1-0/+7
* src/gcrypt.h.in (GCRYCTL_SET_SBOX, gcry_cipher_set_sbox): New. * cipher/cipher.c (_gcry_cipher_ctl): pass GCRYCTL_SET_SBOX to set_extra_info callback. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2014-06-28cipher/gost28147: generate optimized s-boxes from compact onesDmitry Eremin-Solenikov4-274/+270
* cipher/gost-s-box.c: New. Outputs optimized expanded representation of s-boxes (4x256) from compact 16x8 representation. * cipher/Makefile.am: Add gost-sb.h dependency to gost28147.lo * cipher/gost.h: Add sbox to the GOST28147_context structure. * cipher/gost28147.c (gost_setkey): Set default s-box to test s-box from GOST R 34.11 (this was the only one S-box before). * cipher/gost28147.c (gost_val): Use sbox from the context. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2014-06-28gost28147: add OIDs used to define cipher modeDmitry Eremin-Solenikov1-1/+11
* cipher/gost28147 (oids_gost28147): Add OID from RFC4357. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2014-06-28GOST R 34.11-94 add OIDsDmitry Eremin-Solenikov1-1/+14
* cipher/gostr3411-94.c: Add OIDs for GOST R 34.11-94 from RFC 4357. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2014-05-21sha512: fix ARM/NEON implementationJussi Kivilinna1-1/+1
* cipher/sha512-armv7-neon.S (_gcry_sha512_transform_armv7_neon): Byte-swap RW67q and RW1011q correctly in multi-block loop. * tests/basic.c (check_digests): Add large test vector for SHA512. -- Patch fixes bug introduced to multi-block processing by commit df629ba53a6, "Improve performance of SHA-512/ARM/NEON implementation". Patch also adds multi-block test vector for SHA-512. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-05-20Fix ARM assembly when building __PIC__Jussi Kivilinna4-10/+64
* cipher/camellia-arm.S (GET_DATA_POINTER): New. (_gcry_camellia_arm_encrypt_block): Use GET_DATA_POINTER. (_gcry_camellia_arm_decrypt_block): Ditto. * cipher/cast5-arm.S (GET_DATA_POINTER): New. (_gcry_cast5_arm_encrypt_block, _gcry_cast5_arm_decrypt_block) (_gcry_cast5_arm_enc_blk2, _gcry_cast5_arm_dec_blk2): Use GET_DATA_POINTER. * cipher/rijndael-arm.S (GET_DATA_POINTER): New. (_gcry_aes_arm_encrypt_block, _gcry_aes_arm_decrypt_block): Use GET_DATA_POINTER. * cipher/sha1-armv7-neon.S (GET_DATA_POINTER): New. (.LK_VEC): Move from .text to .data section. (_gcry_sha1_transform_armv7_neon): Use GET_DATA_POINTER. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-05-16chacha20: add SSE2/AMD64 optimized implementationJussi Kivilinna3-1/+671
* cipher/Makefile.am: Add 'chacha20-sse2-amd64.S'. * cipher/chacha20-sse2-amd64.S: New. * cipher/chacha20.c (USE_SSE2): New. [USE_SSE2] (_gcry_chacha20_amd64_sse2_blocks): New. (chacha20_do_setkey) [USE_SSE2]: Use SSE2 implementation for blocks function. * configure.ac [host=x86-64]: Add 'chacha20-sse2-amd64.lo'. -- Add Andrew Moon's public domain SSE2 implementation of ChaCha20. Original source is available at: https://github.com/floodyberry/chacha-opt Benchmark on Intel i5-4570 (haswell), with "--disable-hwf intel-avx2 --disable-hwf intel-ssse3": Old: CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 1.97 ns/B 483.8 MiB/s 6.31 c/B STREAM dec | 1.97 ns/B 483.6 MiB/s 6.31 c/B New: CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 0.931 ns/B 1024.7 MiB/s 2.98 c/B STREAM dec | 0.930 ns/B 1025.0 MiB/s 2.98 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-05-16poly1305: add AMD64/AVX2 optimized implementationJussi Kivilinna4-4/+1001
* cipher/Makefile.am: Add 'poly1305-avx2-amd64.S'. * cipher/poly1305-avx2-amd64.S: New. * cipher/poly1305-internal.h (POLY1305_USE_AVX2) (POLY1305_AVX2_BLOCKSIZE, POLY1305_AVX2_STATESIZE) (POLY1305_AVX2_ALIGNMENT): New. (POLY1305_LARGEST_BLOCKSIZE, POLY1305_LARGEST_STATESIZE) (POLY1305_STATE_ALIGNMENT): Use AVX2 versions when needed. * cipher/poly1305.c [POLY1305_USE_AVX2] (_gcry_poly1305_amd64_avx2_init_ext) (_gcry_poly1305_amd64_avx2_finish_ext) (_gcry_poly1305_amd64_avx2_blocks, poly1305_amd64_avx2_ops): New. (_gcry_poly1305_init) [POLY1305_USE_AVX2]: Use AVX2 implementation if AVX2 supported by CPU. * configure.ac [host=x86_64]: Add 'poly1305-avx2-amd64.lo'. -- Add Andrew Moon's public domain AVX2 implementation of Poly1305. Original source is available at: https://github.com/floodyberry/poly1305-opt Benchmarks on Intel i5-4570 (haswell): Old: | nanosecs/byte mebibytes/sec cycles/byte POLY1305 | 0.448 ns/B 2129.5 MiB/s 1.43 c/B New: | nanosecs/byte mebibytes/sec cycles/byte POLY1305 | 0.205 ns/B 4643.5 MiB/s 0.657 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-05-12poly1305: add AMD64/SSE2 optimized implementationJussi Kivilinna4-3/+1084
* cipher/Makefile.am: Add 'poly1305-sse2-amd64.S'. * cipher/poly1305-internal.h (POLY1305_USE_SSE2) (POLY1305_SSE2_BLOCKSIZE, POLY1305_SSE2_STATESIZE) (POLY1305_SSE2_ALIGNMENT): New. (POLY1305_LARGEST_BLOCKSIZE, POLY1305_LARGEST_STATESIZE) (POLY1305_STATE_ALIGNMENT): Use SSE2 versions when needed. * cipher/poly1305-sse2-amd64.S: New. * cipher/poly1305.c [POLY1305_USE_SSE2] (_gcry_poly1305_amd64_sse2_init_ext) (_gcry_poly1305_amd64_sse2_finish_ext) (_gcry_poly1305_amd64_sse2_blocks, poly1305_amd64_sse2_ops): New. (_gcry_polu1305_init) [POLY1305_USE_SSE2]: Use SSE2 version. * configure.ac [host=x86_64]: Add 'poly1305-sse2-amd64.lo'. -- Add Andrew Moon's public domain SSE2 implementation of Poly1305. Original source is available at: https://github.com/floodyberry/poly1305-opt Benchmarks on Intel i5-4570 (haswell): Old: | nanosecs/byte mebibytes/sec cycles/byte POLY1305 | 0.844 ns/B 1130.2 MiB/s 2.70 c/B New: | nanosecs/byte mebibytes/sec cycles/byte POLY1305 | 0.448 ns/B 2129.5 MiB/s 1.43 c/B Benchmarks on Intel i5-2450M (sandy-bridge): Old: | nanosecs/byte mebibytes/sec cycles/byte POLY1305 | 1.25 ns/B 763.0 MiB/s 3.12 c/B New: | nanosecs/byte mebibytes/sec cycles/byte POLY1305 | 0.605 ns/B 1575.9 MiB/s 1.51 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-05-12Add Poly1305 based cipher AEAD modeJussi Kivilinna4-5/+382
* cipher/Makefile.am: Add 'cipher-poly1305.c'. * cipher/cipher-internal.h (gcry_cipher_handle): Add 'u_mode.poly1305'. (_gcry_cipher_poly1305_encrypt, _gcry_cipher_poly1305_decrypt) (_gcry_cipher_poly1305_setiv, _gcry_cipher_poly1305_authenticate) (_gcry_cipher_poly1305_get_tag, _gcry_cipher_poly1305_check_tag): New. * cipher/cipher-poly1305.c: New. * cipher/cipher.c (_gcry_cipher_open_internal, cipher_setkey) (cipher_reset, cipher_encrypt, cipher_decrypt, _gcry_cipher_setiv) (_gcry_cipher_authenticate, _gcry_cipher_gettag) (_gcry_cipher_checktag): Handle 'GCRY_CIPHER_MODE_POLY1305'. (cipher_setiv): Move handling of 'GCRY_CIPHER_MODE_GCM' to ... (_gcry_cipher_setiv): ... here, as with other modes. * src/gcrypt.h.in: Add 'GCRY_CIPHER_MODE_POLY1305'. * tests/basic.c (_check_poly1305_cipher, check_poly1305_cipher): New. (check_ciphers): Add Poly1305 check. (check_cipher_modes): Call 'check_poly1305_cipher'. * tests/bench-slope.c (bench_gcm_encrypt_do_bench): Rename to bench_aead_... and take nonce as argument. (bench_gcm_decrypt_do_bench, bench_gcm_authenticate_do_bench): Ditto. (bench_gcm_encrypt_do_bench, bench_gcm_decrypt_do_bench) (bench_gcm_authenticate_do_bench, bench_poly1305_encrypt_do_bench) (bench_poly1305_decrypt_do_bench) (bench_poly1305_authenticate_do_bench, poly1305_encrypt_ops) (poly1305_decrypt_ops, poly1305_authenticate_ops): New. (cipher_modes): Add Poly1305. (cipher_bench_one): Add special handling for Poly1305. -- Patch adds Poly1305 based AEAD cipher mode to libgcrypt. ChaCha20 variant of this mode is proposed for use in TLS and ipsec: https://tools.ietf.org/html/draft-agl-tls-chacha20poly1305-04 http://tools.ietf.org/html/draft-nir-ipsecme-chacha20-poly1305-02 Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-05-12Add Poly1305-AES (-Camellia, etc) MACsJussi Kivilinna3-14/+180
* cipher/mac-internal.h (_gcry_mac_type_spec_poly1305_aes) (_gcry_mac_type_spec_poly1305_camellia) (_gcry_mac_type_spec_poly1305_twofish) (_gcry_mac_type_spec_poly1305_serpent) (_gcry_mac_type_spec_poly1305_seed): New. * cipher/mac-poly1305.c (poly1305mac_context_s): Add 'hd' and 'nonce_set'. (poly1305mac_open, poly1305mac_close, poly1305mac_setkey): Add handling for Poly1305-*** MACs. (poly1305mac_prepare_key, poly1305mac_setiv): New. (poly1305mac_reset, poly1305mac_write, poly1305mac_read): Add handling for 'nonce_set'. (poly1305mac_ops): Add 'poly1305mac_setiv'. (_gcry_mac_type_spec_poly1305_aes) (_gcry_mac_type_spec_poly1305_camellia) (_gcry_mac_type_spec_poly1305_twofish) (_gcry_mac_type_spec_poly1305_serpent) (_gcry_mac_type_spec_poly1305_seed): New. * cipher/mac.c (mac_list): Add Poly1305-AES, Poly1305-Twofish, Poly1305-Serpent, Poly1305-SEED and Poly1305-Camellia. * src/gcrypt.h.in: Add 'GCRY_MAC_POLY1305_AES', 'GCRY_MAC_POLY1305_CAMELLIA', 'GCRY_MAC_POLY1305_TWOFISH', 'GCRY_MAC_POLY1305_SERPENT' and 'GCRY_MAC_POLY1305_SEED'. * tests/basic.c (check_mac): Add Poly1305-AES test vectors. * tests/bench-slope.c (bench_mac_init): Set IV for Poly1305-*** MACs. * tests/bench-slope.c (mac_bench): Set IV for Poly1305-*** MACs. -- Patch adds Bernstein's Poly1305-AES message authentication code to libgcrypt and other variants of Poly1305-<128-bit block cipher>. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-05-12Add Poly1305 MACJussi Kivilinna6-2/+1091
* cipher/Makefile.am: Add 'mac-poly1305.c', 'poly1305.c' and 'poly1305-internal.h'. * cipher/mac-internal.h (poly1305mac_context_s): New. (gcry_mac_handle): Add 'u.poly1305mac'. (_gcry_mac_type_spec_poly1305mac): New. * cipher/mac-poly1305.c: New. * cipher/mac.c (mac_list): Add Poly1305. * cipher/poly1305-internal.h: New. * cipher/poly1305.c: New. * src/gcrypt.h.in: Add 'GCRY_MAC_POLY1305'. * tests/basic.c (check_mac): Add Poly1035 test vectors; Allow overriding lengths of data and key buffers. * tests/bench-slope.c (mac_bench): Increase max algo number from 500 to 600. * tests/benchmark.c (mac_bench): Ditto. -- Patch adds Bernstein's Poly1305 message authentication code to libgcrypt. Implementation is based on Andrew Moon's public domain implementation from: https://github.com/floodyberry/poly1305-opt The algorithm added by this patch is the plain Poly1305 without AES and takes 32-bit key that must not be reused. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-05-12chacha20/AVX2: clear upper-halfs of YMM registers on entryJussi Kivilinna1-0/+1
* cipher/chacha20-avx2-amd64.S (_gcry_chacha20_amd64_avx2_blocks): Add 'vzeroupper' at beginning. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-05-12chacha20/AVX2: check for ENABLE_AVX2_SUPPORT instead of HAVE_GCC_INLINE_ASM_AVX2Jussi Kivilinna2-2/+2
* cipher/chacha20.c (USE_AVX2): Enable depending on ENABLE_AVX2_SUPPORT, not HAVE_GCC_INLINE_ASM_AVX2. * cipher/chacha20-avx2-amd64.S: Ditto. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-05-12chacha20/SSSE3: clear XMM registers after useJussi Kivilinna1-0/+16
* cipher/chacha20-ssse3-amd64.S (_gcry_chacha20_amd64_ssse3_blocks): On return, clear XMM registers. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-05-11chacha20: add AVX2/AMD64 assembly implementationJussi Kivilinna3-2/+969
* cipher/Makefile.am: Add 'chacha20-avx2-amd64.S'. * cipher/chacha20-avx2-amd64.S: New. * cipher/chacha20.c (USE_AVX2): New macro. [USE_AVX2] (_gcry_chacha20_amd64_avx2_blocks): New. (chacha20_do_setkey): Select AVX2 implementation if there is HW support. (selftest): Increase size of buf by 256. * configure.ac [host=x86-64]: Add 'chacha20-avx2-amd64.lo'. -- Add AVX2 optimized implementation for ChaCha20. Based on implementation by Andrew Moon. SSSE3 (Intel Haswell): CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 0.742 ns/B 1284.8 MiB/s 2.38 c/B STREAM dec | 0.741 ns/B 1286.5 MiB/s 2.37 c/B AVX2: CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 0.393 ns/B 2428.0 MiB/s 1.26 c/B STREAM dec | 0.392 ns/B 2433.6 MiB/s 1.25 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-05-11chacha20: add SSSE3 assembly implementationJussi Kivilinna3-1/+633
* cipher/Makefile.am: Add 'chacha20-ssse3-amd64.S'. * cipher/chacha20-ssse3-amd64.S: New. * cipher/chacha20.c (USE_SSSE3): New macro. [USE_SSSE3] (_gcry_chacha20_amd64_ssse3_blocks): New. (chacha20_do_setkey): Select SSSE3 implementation if there is HW support. * configure.ac [host=x86-64]: Add 'chacha20-ssse3-amd64.lo'. -- Add SSSE3 optimized implementation for ChaCha20. Based on implementation by Andrew Moon. Before (Intel Haswell): CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 1.97 ns/B 483.6 MiB/s 6.31 c/B STREAM dec | 1.97 ns/B 484.0 MiB/s 6.31 c/B After: CHACHA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 0.742 ns/B 1284.8 MiB/s 2.38 c/B STREAM dec | 0.741 ns/B 1286.5 MiB/s 2.37 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-05-11Add ChaCha20 stream cipherJussi Kivilinna3-0/+508
* cipher/Makefile.am: Add 'chacha20.c'. * cipher/chacha20.c: New. * cipher/cipher.c (cipher_list): Add ChaCha20. * configure.ac: Add ChaCha20. * doc/gcrypt.texi: Add ChaCha20. * src/cipher.h (_gcry_cipher_spec_chacha20): New. * src/gcrypt.h.in (GCRY_CIPHER_CHACHA20): Add new algo. * tests/basic.c (MAX_DATA_LEN): Increase to 128 from 100. (check_stream_cipher): Add ChaCha20 test-vectors. (check_ciphers): Add ChaCha20. -- Patch adds Bernstein's ChaCha20 cipher to libgcrypt. Implementation is based on public domain implementations. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-04-16pubkey: Re-map all depreccated RSA algo numbers.Werner Koch1-8/+6
* cipher/pubkey.c (map_algo): Mape RSA_E and RSA_S.
2014-04-15cipher: Fix possible NULL dereference.Werner Koch2-5/+2
* cipher/md.c (_gcry_md_selftest): Check for spec being NULL. -- Also removed left-over code in unused file cipher/test-getrusage.c. Found by Hans-Christoph Steiner with cppcheck.
2014-03-303des: add amd64 assembly implementation for 3DESJussi Kivilinna5-7/+1342
* cipher/Makefile.am: Add 'des-amd64.S'. * cipher/cipher-selftests.c (_gcry_selftest_helper_cbc) (_gcry_selftest_helper_cfb, _gcry_selftest_helper_ctr): Handle failures from 'setkey' function. * cipher/cipher.c (_gcry_cipher_open_internal) [USE_DES]: Setup bulk functions for 3DES. * cipher/des-amd64.S: New file. * cipher/des.c (USE_AMD64_ASM, ATTR_ALIGNED_16): New macros. [USE_AMD64_ASM] (_gcry_3des_amd64_crypt_block) (_gcry_3des_amd64_ctr_enc), _gcry_3des_amd64_cbc_dec) (_gcry_3des_amd64_cfb_dec): New prototypes. [USE_AMD64_ASM] (tripledes_ecb_crypt): New function. (TRIPLEDES_ECB_BURN_STACK): New macro. (_gcry_3des_ctr_enc, _gcry_3des_cbc_dec, _gcry_3des_cfb_dec) (bulk_selftest_setkey, selftest_ctr, selftest_cbc, selftest_cfb): New functions. (selftest): Add call to CTR, CBC and CFB selftest functions. (do_tripledes_encrypt, do_tripledes_decrypt): Use TRIPLEDES_ECB_BURN_STACK. * configure.ac [host=x86-64]: Add 'des-amd64.lo'. * src/cipher.h (_gcry_3des_ctr_enc, _gcry_3des_cbc_dec) (_gcry_3des_cfb_dec): New prototypes. -- Add non-parallel functions for small speed-up and 3-way parallel functions for modes of operation that support parallel processing. Old vs new (Intel Core i5-4570): ================================ enc dec ECB 1.17x 1.17x CBC 1.17x 2.51x CFB 1.16x 2.49x OFB 1.17x 1.17x CTR 2.56x 2.56x Old vs new (Intel Core i5-2450M): ================================= enc dec ECB 1.28x 1.28x CBC 1.27x 2.33x CFB 1.27x 2.34x OFB 1.27x 1.27x CTR 2.36x 2.35x New (Intel Core i5-4570): ========================= 3DES | nanosecs/byte mebibytes/sec cycles/byte ECB enc | 28.39 ns/B 33.60 MiB/s 90.84 c/B ECB dec | 28.27 ns/B 33.74 MiB/s 90.45 c/B CBC enc | 29.50 ns/B 32.33 MiB/s 94.40 c/B CBC dec | 13.35 ns/B 71.45 MiB/s 42.71 c/B CFB enc | 29.59 ns/B 32.23 MiB/s 94.68 c/B CFB dec | 13.41 ns/B 71.12 MiB/s 42.91 c/B OFB enc | 28.90 ns/B 33.00 MiB/s 92.47 c/B OFB dec | 28.90 ns/B 33.00 MiB/s 92.48 c/B CTR enc | 13.39 ns/B 71.20 MiB/s 42.86 c/B CTR dec | 13.39 ns/B 71.21 MiB/s 42.86 c/B Old (Intel Core i5-4570): ========================= 3DES | nanosecs/byte mebibytes/sec cycles/byte ECB enc | 33.24 ns/B 28.69 MiB/s 106.4 c/B ECB dec | 33.26 ns/B 28.67 MiB/s 106.4 c/B CBC enc | 34.45 ns/B 27.69 MiB/s 110.2 c/B CBC dec | 33.45 ns/B 28.51 MiB/s 107.1 c/B CFB enc | 34.43 ns/B 27.70 MiB/s 110.2 c/B CFB dec | 33.41 ns/B 28.55 MiB/s 106.9 c/B OFB enc | 33.79 ns/B 28.22 MiB/s 108.1 c/B OFB dec | 33.79 ns/B 28.22 MiB/s 108.1 c/B CTR enc | 34.27 ns/B 27.83 MiB/s 109.7 c/B CTR dec | 34.27 ns/B 27.83 MiB/s 109.7 c/B New (Intel Core i5-2450M): ========================== 3DES | nanosecs/byte mebibytes/sec cycles/byte ECB enc | 42.21 ns/B 22.59 MiB/s 105.5 c/B ECB dec | 42.23 ns/B 22.58 MiB/s 105.6 c/B CBC enc | 43.70 ns/B 21.82 MiB/s 109.2 c/B CBC dec | 23.25 ns/B 41.02 MiB/s 58.12 c/B CFB enc | 43.71 ns/B 21.82 MiB/s 109.3 c/B CFB dec | 23.23 ns/B 41.05 MiB/s 58.08 c/B OFB enc | 42.73 ns/B 22.32 MiB/s 106.8 c/B OFB dec | 42.73 ns/B 22.32 MiB/s 106.8 c/B CTR enc | 23.31 ns/B 40.92 MiB/s 58.27 c/B CTR dec | 23.35 ns/B 40.84 MiB/s 58.38 c/B Old (Intel Core i5-2450M): ========================== 3DES | nanosecs/byte mebibytes/sec cycles/byte ECB enc | 53.98 ns/B 17.67 MiB/s 134.9 c/B ECB dec | 54.00 ns/B 17.66 MiB/s 135.0 c/B CBC enc | 55.43 ns/B 17.20 MiB/s 138.6 c/B CBC dec | 54.27 ns/B 17.57 MiB/s 135.7 c/B CFB enc | 55.42 ns/B 17.21 MiB/s 138.6 c/B CFB dec | 54.35 ns/B 17.55 MiB/s 135.9 c/B OFB enc | 54.49 ns/B 17.50 MiB/s 136.2 c/B OFB dec | 54.49 ns/B 17.50 MiB/s 136.2 c/B CTR enc | 55.02 ns/B 17.33 MiB/s 137.5 c/B CTR dec | 55.01 ns/B 17.34 MiB/s 137.5 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2014-03-11Add MD2 message digest implementationDmitry Eremin-Solenikov2-0/+185
* cipher/md2.c: New. * cipher/md.c (digest_list): add _gcry_digest_spec_md2. * tests/basic.c (check_digests): add MD2 test vectors. * configure.ac (default_digests): disable md2 by default. -- Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> Some minor indentation fixes by wk.
2014-03-04Add a simple (raw) PKCS#1 padding modeDmitry Eremin-Solenikov3-0/+94
* src/cipher.h (PUBKEY_ENC_PKCS1_RAW): New. * cipher/pubkey-util.c (_gcry_pk_util_parse_flaglist): Handle pkcs1-raw flag. * cipher/pubkey-util.c (_gcry_pk_util_data_to_mpi): Handle s-exp like (data (flags pkcs1-raw) (value xxxxx)) * cipher/rsa-common.c (_gcry_rsa_pkcs1_encode_raw_for_sig): PKCS#1-encode data with embedded hash OID for signature verification. * tests/basic.c (check_pubkey_sign): Add tests for s-exps with pkcs1-raw flag. -- Allow user to specify (flags pkcs1-raw) to enable pkcs1 padding of raw value (no hash algorithm is specified). It is up to the user to verify that the passed value is properly formatted and includes DER-encoded ASN OID of the used hash function. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2014-01-29Fix RSA Blinding.NIIBE Yutaka1-5/+4
* cipher/rsa.c (rsa_decrypt): Loop to get multiplicative inverse. Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
2014-01-28cipher: Take care of ENABLE_NEON_SUPPORT.Werner Koch4-17/+20
* cipher/salsa20.c (USE_ARM_NEON_ASM): Define only if ENABLE_NEON_SUPPORT is defined. * cipher/serpent.c (USE_NEON): Ditto. * cipher/sha1.c (USE_NEON): Ditto. * cipher/sha512.c (USE_ARM_NEON_ASM): Ditto. -- The generic C source files must only include NEON support if that is enabled. The dedicated ASM files are conditionally compiled and thus do not need to use it. GnuPG-bug-id: 1603 Signed-off-by: Werner Koch <wk@gnupg.org>
2014-01-27Fix memory leaks in ecc codeDmitry Eremin-Solenikov2-5/+19
* cipher/ecc-curves.c (_gcry_ecc_update_curve_param): Release passed mpi values. * cipher/ecc.c (compute_keygrip): Fix potential memory leak in error path. * cipher/ecc.c (_gcry_ecc_get_curve): Release temporary mpi. -- ==11657== 252 (80 direct, 172 indirect) bytes in 4 blocks are definitely lost in loss record 8 of 8 ==11657== at 0x4028A28: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so) ==11657== by 0x404178F: _gcry_private_malloc (stdmem.c:113) ==11657== by 0x403CED1: do_malloc.constprop.4 (global.c:768) ==11657== by 0x403DD01: _gcry_xmalloc (global.c:790) ==11657== by 0x409EAE0: _gcry_mpi_alloc (mpiutil.c:84) ==11657== by 0x409C4E4: _gcry_mpi_scan (mpicoder.c:466) ==11657== by 0x404009C: _gcry_sexp_nth_mpi (sexp.c:796) ==11657== by 0x40410B5: _gcry_sexp_vextract_param (sexp.c:2327) ==11657== by 0x4041396: _gcry_sexp_extract_param (sexp.c:2378) ==11657== by 0x407B895: compute_keygrip (ecc.c:1492) ==11657== by 0x404BBE8: _gcry_pk_get_keygrip (pubkey.c:674) ==11657== by 0x403B1BF: gcry_pk_get_keygrip (visibility.c:1056) ==16502== 144 (60 direct, 84 indirect) bytes in 3 blocks are definitely lost in loss record 3 of 7 ==16502== at 0x4028A28: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so) ==16502== by 0x404B4DE: _gcry_private_malloc (stdmem.c:113) ==16502== by 0x404667B: do_malloc (global.c:768) ==16502== by 0x40466E7: _gcry_malloc (global.c:790) ==16502== by 0x4046A55: _gcry_xmalloc (global.c:944) ==16502== by 0x40CD25B: _gcry_mpi_alloc (mpiutil.c:84) ==16502== by 0x40CAC3E: _gcry_mpi_scan (mpicoder.c:548) ==16502== by 0x40A72B2: scanval (ecc-curves.c:432) ==16502== by 0x40A7B0D: _gcry_ecc_get_curve (ecc-curves.c:685) ==16502== by 0x4058164: _gcry_pk_get_curve (pubkey.c:747) ==16502== by 0x4043E14: gcry_pk_get_curve (visibility.c:1067) ==16502== by 0x8048934: check_matching (curves.c:124) Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2014-01-27Fix number of blocks passed used in _gcry_rmd160_mixblockDmitry Eremin-Solenikov1-1/+1
* cipher/rmd160.c (_gcry_rmd160_mixblock): pass 1 to transform -- Currently _gcry_rmd160_mixblock() passes 64 as nblocks to transform() function, while passing only one block of data. This causes acess after the allocated data and tons of errors on each valgrind invokation. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> This fixes commit 50b8c834.
2014-01-20cipher: Fix commit 94030e44Werner Koch1-3/+9
* cipher/tiger.c (tiger_init): Add arg FLAGS. (tiger1_init, tiger2_init): Ditto.
2014-01-19md: Add Whirlpool bug emulation feature.Werner Koch11-46/+216
* src/gcrypt.h.in (GCRY_MD_FLAG_BUGEMU1): New. * src/cipher-proto.h (gcry_md_init_t): Add arg FLAGS. Change all code to implement that flag. * cipher/md.c (gcry_md_context): Replace SECURE and FINALIZED by bit field FLAGS. Add flag BUGEMU1. Change all users. (md_open): Replace args SECURE and HMAC by FLAGS. Init flags.bugemu1. (_gcry_md_open): Add for GCRY_MD_FLAG_BUGEMU1. (md_enable): Pass bugemu1 flag to the hash init function. (_gcry_md_reset): Ditto. -- This problem is for example exhibited in the Linux cryptsetup tool. See https://bbs.archlinux.org/viewtopic.php?id=175737 . It has be been tracked down by Milan Broz. The suggested way of using the flag is: if (whirlpool_bug_assumed) { #if GCRYPT_VERSION_NUMBER >= 0x010601 err = gcry_md_open (&hd, GCRY_MD_WHIRLPOOL, GCRY_MD_FLAG_BUGEMU1) if (gpg_err_code (err) == GPG_ERR_INV_ARG) error ("Need at least Libggcrypt 1.6.1 for the fix"); else { do_hash (hd); gcry_md_close (hd); } #endif } Signed-off-by: Werner Koch <wk@gnupg.org>
2014-01-16Replace ath based mutexes by gpgrt based locks.Werner Koch12-52/+27
* configure.ac (NEED_GPG_ERROR_VERSION): Require 1.13. (gl_LOCK): Remove. * src/ath.c, src/ath.h: Remove. Remove from all files. Replace all mutexes by gpgrt based statically initialized locks. * src/global.c (global_init): Remove ath_init. (_gcry_vcontrol): Make ath install a dummy function. (print_config): Remove threads info line. * doc/gcrypt.texi: Simplify the multi-thread related documentation. -- The current code does only work on ELF systems with weak symbol support. In particular no locks were used under Windows. With the new gpgrt_lock functions from the soon to be released libgpg-error 1.13 we have a better portable scheme which also allows for static initialized mutexes. Signed-off-by: Werner Koch <wk@gnupg.org>
2014-01-14PBKDF2: Use gcry_md_reset to speed up calculation.Milan Broz1-7/+9
* cipher/kdf.c (_gcry_kdf_pkdf2): Use gcry_md_reset to speed up calculation. -- Current PBKDF2 implementation uses gcry_md_set_key in every iteration which is extremely slow (even in comparison with other implementations). Use gcry_md_reset instead and set key only once. With this test program: char input[32000], salt[8], key[16]; gcry_kdf_derive(input, sizeof(input), GCRY_KDF_PBKDF2, gcry_md_map_name("sha1"), salt, sizeof(salt), 100000, sizeof(key), key); running time without patch: real 0m11.165s user 0m11.136s sys 0m0.000s and with patch applied real 0m0.230s user 0m0.184s sys 0m0.024s (The problem was found when cryptsetup started to use gcrypt internal PBKDF2 and for very long keyfiles unlocking time increased drastically. See https://bugzilla.redhat.com/show_bug.cgi?id=1051733) Signed-off-by: Milan Broz <gmazyland@gmail.com>
2014-01-13Fix macro conflict in NetBSDWerner Koch1-9/+11
* cipher/bithelp.h (bswap32): Rename to _gcry_bswap32. (bswap64): Rename to _gcry_bswap64. -- NetBSD provides system macros bswap32 and bswap64 which conflicts with our macros. Prefixing them with _gcry_ is easier than to come up with a proper test. GnuPG-bug-id: 1600 Signed-off-by: Werner Koch <wk@gnupg.org> (cherry picked from commit 36214bfa8f612cd2faa4de217d1a12a8b5faadbf)
2014-01-13Truncate hash values for ECDSA signature schemeDmitry Eremin-Solenikov4-62/+83
* cipher/dsa-common (_gcry_dsa_normalize_hash): New. Truncate opaque mpis as required for DSA and ECDSA signature schemas. * cipher/dsa.c (verify): Return gpg_err_code_t value from verify() to behave like the rest of internal sign/verify functions. * cipher/dsa.c (sign, verify, dsa_verify): Factor out hash truncation. * cipher/ecc-ecdsa.c (_gcry_ecc_ecdsa_sign): Factor out hash truncation. * cipher/ecc-ecdsa.c (_gcry_ecc_ecdsa_verify): as required by ECDSA scheme, truncate hash values to bitlength of used curve. * tests/pubkey.c (check_ecc_sample_key): add a testcase for hash truncation. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2014-01-13Add GOST R 34.10-2012 curves proposed by TC26Dmitry Eremin-Solenikov1-0/+35
* cipher/ecc-curves.c (domain_parmss): Add two GOST R 34.10-2012 curves proposed/pending to standardization by TC26 (Russian cryptography technical comitee). * cipher/ecc-curves.c (curve_alias): Add OID aliases. * tests/curves.c: Increase N_CURVES. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>