summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2013-11-06Speed-up AES-NI key setupJussi Kivilinna1-99/+300
* cipher/rijndael.c [USE_AESNI] (m128i_t): Remove. [USE_AESNI] (u128_t): New. [USE_AESNI] (aesni_do_setkey): New. (do_setkey) [USE_AESNI]: Move AES-NI accelerated key setup to 'aesni_do_setkey'. (do_setkey): Call _gcry_get_hw_features only once. Clear stack after use in generic key setup part. (rijndael_setkey): Remove stack burning. (prepare_decryption) [USE_AESNI]: Use 'u128_t' instead of 'm128i_t' to avoid compiler generated SSE2 instructions and XMM register usage, unroll 'aesimc' setup loop (prepare_decryption): Clear stack after use. [USE_AESNI] (do_aesni_enc_aligned): Update comment about alignment. (do_decrypt): Do not burning stack after prepare_decryption. -- Patch improves the speed of AES key setup with AES-NI instructions. Patch also removes problematic the use of vector typedef, which might cause interference with XMM register usage in AES-NI accelerated code. New: $ tests/benchmark --cipher-with-keysetup --cipher-repetitions 1000 cipher aes aes192 aes256 Running each test 1000 times. ECB/Stream CBC CFB OFB CTR CCM --------------- --------------- --------------- --------------- --------------- --------------- AES 520ms 590ms 1760ms 310ms 1640ms 300ms 1620ms 1610ms 350ms 360ms 2160ms 2140ms AES192 640ms 680ms 2030ms 370ms 1920ms 350ms 1890ms 1880ms 400ms 410ms 2490ms 2490ms AES256 730ms 780ms 2330ms 430ms 2210ms 420ms 2170ms 2180ms 470ms 480ms 2830ms 2840ms Old: $ tests/benchmark --cipher-with-keysetup --cipher-repetitions 1000 cipher aes aes192 aes256 Running each test 1000 times. ECB/Stream CBC CFB OFB CTR CCM --------------- --------------- --------------- --------------- --------------- --------------- AES 670ms 740ms 1910ms 470ms 1790ms 470ms 1770ms 1760ms 520ms 510ms 2310ms 2310ms AES192 820ms 860ms 2220ms 550ms 2110ms 540ms 2070ms 2070ms 600ms 590ms 2670ms 2680ms AES256 920ms 970ms 2510ms 620ms 2390ms 600ms 2360ms 2370ms 650ms 660ms 3020ms 3020ms Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-11-06Avoid burn stack in Arcfour setkeyJussi Kivilinna1-1/+0
* cipher/arcfour.c (arcfour_setkey): Remove stack burning. -- Stack is already cleared in do_arcfour_setkey and GCC is inlining do_arcfour_setkey to arcfour_setkey which renders this _gcry_burn_stack broken anyways. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-11-06Avoid burn_stack in CAST5 setkeyJussi Kivilinna1-4/+3
* cipher/cast5.c (do_cast_setkey): Use wipememory instead of memset. (cast_setkey): Remove stack burning. -- Burning stack does not work properly when compiler inlines static functions, therefore use wipememory to clear stack after use instead of relying on _gcry_burn_stack. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-11-06Improve Serpent key setup speedJussi Kivilinna1-72/+62
* cipher/serpent.c (SBOX, SBOX_INVERSE): Remove index argument. (serpent_subkeys_generate): Use smaller temporary arrays for subkey generation and perform stack clearing locally. (serpent_setkey_internal): Use wipememory to clear stack and remove _gcry_burn_stack. (serpent_setkey): Remove unneeded _gcry_burn_stack. -- Avoid using large arrays and large stack burning to gain extra speed for key setup. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-11-06Modify encrypt/decrypt arguments for in-placeJussi Kivilinna1-6/+12
* cipher/cipher.c (gcry_cipher_encrypt, gcry_cipher_decrypt): Modify local arguments if in-place operation. -- Modify encrypt/decrypt argument variables instead of calling subfunction with different arguments. This allows compiler to inline the subfunction for small speedup. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-11-06Speed up StribogJussi Kivilinna1-1152/+1070
* cipher/stribog.c (STRIBOG_TABLES): Remove. (Pi): Remove. [!STRIBOG_TABLES] (A, strido): Remove. (stribog_table): New table pre-reordered with Pi values. (strido): Rewrite for new table. (LPSX): Rewrite for new table. (xor): Remove. (g): Small tweaks. -- Patch optimizes the table-lookup implementation a bit. Patch also removes the unused non-table implementation from source. On Intel Core i5-4570 (amd64, 3.2Ghz): After: | nanosecs/byte mebibytes/sec cycles/byte STRIBOG256 | 9.22 ns/B 103.4 MiB/s 29.53 c/B STRIBOG512 | 9.23 ns/B 103.4 MiB/s 29.53 c/B Before: | nanosecs/byte mebibytes/sec cycles/byte STRIBOG256 | 30.17 ns/B 31.61 MiB/s 96.56 c/B STRIBOG512 | 30.20 ns/B 31.57 MiB/s 96.68 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-11-06Tweak AES-NI bulk CTR mode slightlyJussi Kivilinna1-38/+45
* cipher/rijndael.c [USE_AESNI] (aesni_cleanup_2_5): Rename to... (aesni_cleanup_2_6): ...this and clear also 'xmm6'. [USE_AESNI && __i386__] (do_aesni_ctr, do_aesni_ctr_4): Prevent inlining only on i386, allow on AMD64. [USE_AESNI] (do_aesni_ctr, do_aesni_ctr_4): Use counter block from 'xmm5' and byte-swap mask from 'xmm6'. (_gcry_aes_ctr_enc) [USE_AESNI]: Preload counter block to 'xmm5' and byte-swap mask to 'xmm6'. (_gcry_aes_ctr_enc, _gcry_aes_cfb_dec, _gcry_aes_cbc_dec): Use 'aesni_cleanup_2_6'. -- Small tweak that yeilds ~5% more speed on Intel Core i5-4570. After: AES | nanosecs/byte mebibytes/sec cycles/byte CTR enc | 0.274 ns/B 3482.5 MiB/s 0.877 c/B CTR dec | 0.274 ns/B 3486.8 MiB/s 0.876 c/B Before: AES | nanosecs/byte mebibytes/sec cycles/byte CTR enc | 0.288 ns/B 3312.5 MiB/s 0.922 c/B CTR dec | 0.288 ns/B 3312.6 MiB/s 0.922 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-11-06Tweak bench-slope parametersJussi Kivilinna1-3/+3
* tests/bench-slope.c (BUF_STEP_SIZE): Half step size to 64. (NUM_MEASUREMENT_REPETITIONS): Double repetitions to 64. -- Tweak parameters for better repeatability of results with fast ciphers (AES-NI). Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-11-06Optimize Blowfish weak key checkJussi Kivilinna1-11/+90
* cipher/blowfish.c (hashset_elem, val_to_hidx, add_val): New. (do_bf_setkey): Use faster algorithm for detecting weak keys. (bf_setkey): Move stack burning to do_bf_setkey. -- Patch optimizes the weak key check for Blowfish. Instead of iterating through sbox-tables for duplicates, insert values to hash-set and detect collisions. Old check code was taking slightly longer time than the actual key setup of Blowfish, which by itself is already quite slow. After: $ tests/benchmark --cipher-with-keysetup --cipher-repetitions 10 cipher blowfish Running each test 10 times. ECB/Stream CBC CFB OFB CTR CCM --------------- --------------- --------------- --------------- --------------- --------------- BLOWFISH 410ms 440ms 430ms 370ms 440ms 370ms 430ms 440ms 370ms 370ms - - Before: $ tests/benchmark --cipher-with-keysetup --cipher-repetitions 10 cipher blowfish Running each test 10 times. ECB/Stream CBC CFB OFB CTR CCM --------------- --------------- --------------- --------------- --------------- --------------- BLOWFISH 780ms 770ms 780ms 730ms 780ms 730ms 780ms 790ms 720ms 730ms - - Without key-setup: $ tests/benchmark --cipher-repetitions 10 cipher blowfish Running each test 10 times. ECB/Stream CBC CFB OFB CTR CCM --------------- --------------- --------------- --------------- --------------- --------------- BLOWFISH 70ms 70ms 80ms 30ms 80ms 30ms 80ms 90ms 20ms 30ms - - Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-11-06Fix __builtin_bswap32/64 checksJussi Kivilinna1-4/+4
* configure.ac (gcry_cv_have_builtin_bswap32) (gcry_cv_have_builtin_bswap64): Change compile checks to link checks. -- Patch changes compile checks to link checks for __builtin_bswap(32|64). Compiling obviously works with missing functions, linking not so much. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-11-06Fix 'u32' build error with CamelliaJussi Kivilinna1-3/+3
* cipher/camellia.c: Add include for <config.h> and "types.h". (u32): Remove. (u8): Typedef as 'byte'. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-11-06pubkey: Add forward compatibility feature.Werner Koch1-8/+15
* cipher/pubkey-util.c (_gcry_pk_util_parse_flaglist): Add "igninvflag". -- If future versions of Libgcrypt want to add optional flags to a pubkey s-expression, they may use the "igninvflag" flag to make the flag parser ignore flags it does not know about. Signed-off-by: Werner Koch <wk@gnupg.org>
2013-11-05ecc: Require "eddsa" flag for curve Ed25519.Werner Koch10-59/+40
* src/cipher.h (PUBKEY_FLAG_ECDSA): Remove. * cipher/pubkey-util.c (_gcry_pk_util_parse_flaglist): Remove "ecdsa". * cipher/ecc.c (ecc_generate, ecc_sign, ecc_verify): Require "eddsa" flag. * cipher/ecc-misc.c (_gcry_ecc_compute_public): Depend "eddsa" flag. * tests/benchmark.c, tests/keygen.c, tests/pubkey.c * tests/t-ed25519.c, tests/t-mpi-point.c: Adjust for changed flags. -- This changes make using ECDSA signatures the default for all curves. If another signing algorithm is to be used, the corresponding flag needs to be given. In particular the flags "eddsa" is now always required with curve Ed25519 to comply with the specs. This change makes the code better readable by not assuming a certain signature algorithm depending on the curve. Signed-off-by: Werner Koch <wk@gnupg.org>
2013-11-05ecc: Fully implement Ed25519 compression in ECDSA mode.Werner Koch10-117/+159
* src/ec-context.h (mpi_ec_ctx_s): Add field FLAGS. * mpi/ec.c (ec_p_init): Add arg FLAGS. Change all callers to pass it. * cipher/ecc-curves.c (point_from_keyparam): Add arg EC, parse as opaque mpi and use eddsa decoding depending on the flag. (_gcry_mpi_ec_new): Rearrange to parse Q and D after knowing the curve. Signed-off-by: Werner Koch <wk@gnupg.org>
2013-11-05mpi: Add function gcry_mpi_set_opaque_copy.Werner Koch8-4/+48
* src/gcrypt.h.in (gcry_mpi_set_opaque_copy): New. * src/visibility.c (gcry_mpi_set_opaque_copy): New. * src/visibility.h (gcry_mpi_set_opaque_copy): Mark visible. * src/libgcrypt.def, src/libgcrypt.vers: Add new API. * tests/mpitests.c (test_opaque): Add test. Signed-off-by: Werner Koch <wk@gnupg.org>
2013-11-04Make test vectors 'static const'Jussi Kivilinna7-35/+42
* cipher/arcfour.c (selftest): Change test vectors to 'static const'. * cipher/blowfish.c (selftest): Ditto. * cipher/camellia-glue.c (selftest): Ditto. * cipher/cast5.c (selftest): Ditto. * cipher/des.c (selftest): Ditto. * cipher/rijndael.c (selftest): Ditto. * tests/basic.c (cipher_cbc_mac_cipher, check_aes128_cbc_cts_cipher) (check_ctr_cipher, check_cfb_cipher, check_ofb_cipher) (check_ccm_cipher, check_stream_cipher) (check_stream_cipher_large_block, check_bulk_cipher_modes) (check_ciphers, check_digests, check_hmac, check_pubkey_sign) (check_pubkey_sign_ecdsa, check_pubkey_crypt, check_pubkey): Ditto. -- Some test vectors have been defined without 'static' and thus end up being initialized on runtime. Change these to 'static'. Also change test vectors const where possible. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-11-03Make jump labels local in Salsa20 assemblyJussi Kivilinna2-45/+45
* cipher/salsa20-amd64.S: Rename '._labels' to '.L_labels'. * cipher/salsa20-armv7-neon.S: Ditto. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-30bithelp: fix undefined behaviour with rol and rorJussi Kivilinna1-3/+3
* cipher/bithelp.h (rol, ror): Mask shift with 31. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-29tests: Add feature to skip benchmarks.Werner Koch2-10/+20
* tests/benchmark.c (main): Add feature to skip the test. * tests/bench-slope.c (main): Ditto. (get_slope): Repace C++ style comment. (double_cmp, cipher_bench, _hash_bench): Repalce system reserved symbols. -- During development a quick run of the regression is often useful, however the benchmarks take a lot of time and thus this feature allows to skip theses tests. Signed-off-by: Werner Koch <wk@gnupg.org>
2013-10-29ecc: Finish Ed25519/ECDSA hack.Werner Koch2-9/+41
* cipher/ecc.c (ecc_generate): Fix Ed25519/ECDSA case. (ecc_verify): Implement ED25519/ECDSA uncompression. -- With this change Ed25519 may be used with ECDSA while using the Ed25519 standard compression technique. Signed-off-by: Werner Koch <wk@gnupg.org>
2013-10-29Typo fix.Werner Koch1-1/+1
--
2013-10-29ecc: Add flags "noparam" and "comp".Werner Koch5-84/+201
* src/cipher.h (PUBKEY_FLAG_NOPARAM, PUBKEY_FLAG_COMP): New. * cipher/pubkey-util.c (_gcry_pk_util_parse_flaglist): Parse new flags and change code for possible faster parsing. * cipher/ecc.c (ecc_generate): Implement the "noparam" flag. (ecc_sign): Ditto. (ecc_verify): Ditto. * tests/keygen.c (check_ecc_keys): Use the "noparam" flag. * cipher/ecc.c (ecc_generate): Fix parsing of the deprecated transient-flag parameter. (ecc_verify): Do not make Q optional in the extract-param call. -- Note that the "comp" flag has not yet any effect. Signed-off-by: Werner Koch <wk@gnupg.org>
2013-10-28Fix typos in documentationJussi Kivilinna1-43/+43
* doc/gcrypt.texi: Fix some typos. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-28Add ARM NEON assembly implementation of SerpentJussi Kivilinna4-1/+997
* cipher/Makefile.am: Add 'serpent-armv7-neon.S'. * cipher/serpent-armv7-neon.S: New. * cipher/serpent.c (USE_NEON): New macro. (serpent_context_t) [USE_NEON]: Add 'use_neon'. [USE_NEON] (_gcry_serpent_neon_ctr_enc, _gcry_serpent_neon_cfb_dec) (_gcry_serpent_neon_cbc_dec): New prototypes. (serpent_setkey_internal) [USE_NEON]: Detect NEON support. (_gcry_serpent_neon_ctr_enc, _gcry_serpent_neon_cfb_dec) (_gcry_serpent_neon_cbc_dec) [USE_NEON]: Use NEON implementations to process eight blocks in parallel. * configure.ac [neonsupport]: Add 'serpent-armv7-neon.lo'. -- Patch adds ARM NEON optimized implementation of Serpent cipher to speed up parallelizable bulk operations. Benchmarks on ARM Cortex-A8 (armhf, 1008 Mhz): Old: SERPENT128 | nanosecs/byte mebibytes/sec cycles/byte CBC dec | 43.53 ns/B 21.91 MiB/s 43.88 c/B CFB dec | 44.77 ns/B 21.30 MiB/s 45.13 c/B CTR enc | 45.21 ns/B 21.10 MiB/s 45.57 c/B CTR dec | 45.21 ns/B 21.09 MiB/s 45.57 c/B New: SERPENT128 | nanosecs/byte mebibytes/sec cycles/byte CBC dec | 26.26 ns/B 36.32 MiB/s 26.47 c/B CFB dec | 26.21 ns/B 36.38 MiB/s 26.42 c/B CTR enc | 26.20 ns/B 36.40 MiB/s 26.41 c/B CTR dec | 26.20 ns/B 36.40 MiB/s 26.41 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-28Add ARM NEON assembly implementation of Salsa20Jussi Kivilinna4-10/+1027
* cipher/Makefile.am: Add 'salsa20-armv7-neon.S'. * cipher/salsa20-armv7-neon.S: New. * cipher/salsa20.c [USE_ARM_NEON_ASM]: New macro. (struct SALSA20_context_s, salsa20_core_t, salsa20_keysetup_t) (salsa20_ivsetup_t): New. (SALSA20_context_t) [USE_ARM_NEON_ASM]: Add 'use_neon'. (SALSA20_context_t): Add 'keysetup', 'ivsetup' and 'core'. (salsa20_core): Change 'src' argument to 'ctx'. [USE_ARM_NEON_ASM] (_gcry_arm_neon_salsa20_encrypt): New prototype. [USE_ARM_NEON_ASM] (salsa20_core_neon, salsa20_keysetup_neon) (salsa20_ivsetup_neon): New. (salsa20_do_setkey): Setup keysetup, ivsetup and core with default functions. (salsa20_do_setkey) [USE_ARM_NEON_ASM]: When NEON support detect, set keysetup, ivsetup and core with ARM NEON functions. (salsa20_do_setkey): Call 'ctx->keysetup'. (salsa20_setiv): Call 'ctx->ivsetup'. (salsa20_do_encrypt_stream) [USE_ARM_NEON_ASM]: Process large buffers in ARM NEON implementation. (salsa20_do_encrypt_stream): Call 'ctx->core' instead of directly calling 'salsa20_core'. (selftest): Add test to check large buffer processing and block counter updating. * configure.ac [neonsupport]: 'Add salsa20-armv7-neon.lo'. -- Patch adds fast ARM NEON assembly implementation for Salsa20. Implementation gains extra speed by processing three blocks in parallel with help of ARM NEON vector processing unit. This implementation is based on public domain code by Peter Schwabe and D. J. Bernstein and it is available in SUPERCOP benchmarking framework. For more details on this work, check paper "NEON crypto" by Daniel J. Bernstein and Peter Schwabe: http://cryptojedi.org/papers/#neoncrypto Benchmark results on Cortex-A8 (1008 Mhz): Before: SALSA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 18.88 ns/B 50.51 MiB/s 19.03 c/B STREAM dec | 18.89 ns/B 50.49 MiB/s 19.04 c/B = SALSA20R12 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 13.60 ns/B 70.14 MiB/s 13.71 c/B STREAM dec | 13.60 ns/B 70.13 MiB/s 13.71 c/B After: SALSA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 5.48 ns/B 174.1 MiB/s 5.52 c/B STREAM dec | 5.47 ns/B 174.2 MiB/s 5.52 c/B = SALSA20R12 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 3.65 ns/B 260.9 MiB/s 3.68 c/B STREAM dec | 3.65 ns/B 261.6 MiB/s 3.67 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-28Add AMD64 assembly implementation of Salsa20Jussi Kivilinna4-74/+1056
* cipher/Makefile.am: Add 'salsa20-amd64.S'. * cipher/salsa20-amd64.S: New. * cipher/salsa20.c (USE_AMD64): New macro. [USE_AMD64] (_gcry_salsa20_amd64_keysetup, _gcry_salsa20_amd64_ivsetup) (_gcry_salsa20_amd64_encrypt_blocks): New prototypes. [USE_AMD64] (salsa20_keysetup, salsa20_ivsetup, salsa20_core): New. [!USE_AMD64] (salsa20_core): Change 'src' to non-constant, update block counter in 'salsa20_core' and return burn stack depth. [!USE_AMD64] (salsa20_keysetup, salsa20_ivsetup): New. (salsa20_do_setkey): Move generic key setup to 'salsa20_keysetup'. (salsa20_setkey): Fix burn stack depth. (salsa20_setiv): Move generic IV setup to 'salsa20_ivsetup'. (salsa20_do_encrypt_stream) [USE_AMD64]: Process large buffers in AMD64 implementation. (salsa20_do_encrypt_stream): Move stack burning to this function... (salsa20_encrypt_stream, salsa20r12_encrypt_stream): ...from these functions. * configure.ac [x86-64]: Add 'salsa20-amd64.lo'. -- Patch adds fast AMD64 assembly implementation for Salsa20. This implementation is based on public domain code by D. J. Bernstein and it is available at http://cr.yp.to/snuffle.html (amd64-xmm6). Implementation gains extra speed by processing four blocks in parallel with help SSE2 instructions. Benchmark results on Intel Core i5-4570 (3.2 Ghz): Before: SALSA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 3.88 ns/B 246.0 MiB/s 12.41 c/B STREAM dec | 3.88 ns/B 246.0 MiB/s 12.41 c/B = SALSA20R12 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 2.46 ns/B 387.9 MiB/s 7.87 c/B STREAM dec | 2.46 ns/B 387.7 MiB/s 7.87 c/B After: SALSA20 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 0.985 ns/B 967.8 MiB/s 3.15 c/B STREAM dec | 0.987 ns/B 966.5 MiB/s 3.16 c/B = SALSA20R12 | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 0.636 ns/B 1500.5 MiB/s 2.03 c/B STREAM dec | 0.636 ns/B 1499.2 MiB/s 2.04 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-28Add new benchmarking utility, bench-slopeJussi Kivilinna2-2/+1174
* tests/Makefile.am (TESTS): Add 'bench-slope'. * tests/bench-slope.c: New. -- Bench-slope is new benchmarking tool for libgcrypt for obtaining overheadless cycles/byte speed of cipher and hash algorithms. Tool measures the time each operation (hash/encrypt/decrypt/authentication) takes for different buffer sizes of from ~0kB to ~4kB and calculates the slope for these data points. The default output is then given as nanosecs/byte and mebibytes/sec. If user provides the speed of used CPU, tool also outputs cycles/byte result (CPU-Ghz * ns/B = c/B). Output without CPU speed (with ARM Cortex-A8): $ tests/bench-slope hash Hash: | nanosecs/byte mebibytes/sec cycles/byte MD5 | 7.35 ns/B 129.7 MiB/s - c/B SHA1 | 12.30 ns/B 77.53 MiB/s - c/B RIPEMD160 | 15.96 ns/B 59.77 MiB/s - c/B TIGER192 | 55.55 ns/B 17.17 MiB/s - c/B SHA256 | 24.38 ns/B 39.12 MiB/s - c/B SHA384 | 34.24 ns/B 27.86 MiB/s - c/B SHA512 | 34.19 ns/B 27.90 MiB/s - c/B SHA224 | 24.38 ns/B 39.12 MiB/s - c/B MD4 | 5.68 ns/B 168.0 MiB/s - c/B CRC32 | 9.26 ns/B 103.0 MiB/s - c/B CRC32RFC1510 | 9.20 ns/B 103.6 MiB/s - c/B CRC24RFC2440 | 87.31 ns/B 10.92 MiB/s - c/B WHIRLPOOL | 253.3 ns/B 3.77 MiB/s - c/B TIGER | 55.55 ns/B 17.17 MiB/s - c/B TIGER2 | 55.55 ns/B 17.17 MiB/s - c/B GOSTR3411_94 | 212.0 ns/B 4.50 MiB/s - c/B STRIBOG256 | 630.1 ns/B 1.51 MiB/s - c/B STRIBOG512 | 630.1 ns/B 1.51 MiB/s - c/B = With CPU speed (with Intel i5-4570, 3.2Ghz when turbo-boost disabled): $ tests/bench-slope --cpu-mhz 3201 cipher arcfour blowfish aes Cipher: ARCFOUR | nanosecs/byte mebibytes/sec cycles/byte STREAM enc | 2.43 ns/B 392.1 MiB/s 7.79 c/B STREAM dec | 2.44 ns/B 390.2 MiB/s 7.82 c/B = BLOWFISH | nanosecs/byte mebibytes/sec cycles/byte ECB enc | 7.62 ns/B 125.2 MiB/s 24.38 c/B ECB dec | 7.63 ns/B 125.0 MiB/s 24.43 c/B CBC enc | 9.18 ns/B 103.9 MiB/s 29.38 c/B CBC dec | 2.60 ns/B 366.2 MiB/s 8.34 c/B CFB enc | 9.17 ns/B 104.0 MiB/s 29.35 c/B CFB dec | 2.66 ns/B 358.1 MiB/s 8.53 c/B OFB enc | 8.97 ns/B 106.3 MiB/s 28.72 c/B OFB dec | 8.97 ns/B 106.3 MiB/s 28.71 c/B CTR enc | 2.60 ns/B 366.5 MiB/s 8.33 c/B CTR dec | 2.60 ns/B 367.1 MiB/s 8.32 c/B = AES | nanosecs/byte mebibytes/sec cycles/byte ECB enc | 0.439 ns/B 2173.0 MiB/s 1.40 c/B ECB dec | 0.489 ns/B 1949.5 MiB/s 1.57 c/B CBC enc | 1.64 ns/B 580.8 MiB/s 5.26 c/B CBC dec | 0.219 ns/B 4357.6 MiB/s 0.701 c/B CFB enc | 1.53 ns/B 623.6 MiB/s 4.90 c/B CFB dec | 0.219 ns/B 4350.5 MiB/s 0.702 c/B OFB enc | 1.51 ns/B 629.9 MiB/s 4.85 c/B OFB dec | 1.51 ns/B 629.9 MiB/s 4.85 c/B CTR enc | 0.288 ns/B 3308.5 MiB/s 0.923 c/B CTR dec | 0.288 ns/B 3316.9 MiB/s 0.920 c/B CCM enc | 1.93 ns/B 493.8 MiB/s 6.18 c/B CCM dec | 1.93 ns/B 494.0 MiB/s 6.18 c/B CCM auth | 1.64 ns/B 580.1 MiB/s 5.26 c/B = Note: It's highly recommented to disable turbo-boost and dynamic CPU frequency features when making these kind of measurements to reduce variance. Note: The results are maximum performance for each operation; the actual speed in application depends on various matters, such as: used buffer sizes, cache usage, etc. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-28Change .global to .globl in assembly filesJussi Kivilinna11-32/+32
* cipher/blowfish-arm.S: Change '.global' to '.globl'. * cipher/camellia-aesni-avx-amd64.S: Ditto. * cipher/camellia-aesni-avx2-amd64.S: Ditto. * cipher/camellia-arm.S: Ditto. * cipher/cast5-amd64.S: Ditto. * cipher/rijndael-amd64.S: Ditto. * cipher/rijndael-arm.S: Ditto. * cipher/serpent-avx2-amd64.S: Ditto. * cipher/serpent-sse2-amd64.S: Ditto. * cipher/twofish-amd64.S: Ditto. * cipher/twofish-arm.S: Ditto. -- The .global keyword is used only in newer versions of GAS, so change these to older .globl for better portability. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-26Deduplicate code for ECB encryption and decryptionJussi Kivilinna1-30/+14
* cipher/cipher.c (do_ecb_crypt): New, based on old 'do_ecb_encrypt'. (do_ecb_encrypt): Use 'do_ecb_crypt', pass encryption function. (do_ecb_decrypt): Use 'do_ecb_crypt', pass decryption function. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-26Drop _gcry_cipher_ofb_decrypt as it duplicates _gcry_cipher_ofb_encryptDmitry Eremin-Solenikov3-74/+1
* cipher/cipher.c (cipher_decrypt): Use _gcry_cipher_ofb_encrypt for OFB decryption. * cipher/cipher-internal.h: Remove _gcry_cipher_ofb_decrypt declaration. * cipher/cipher-ofb.c (_gcry_cipher_ofb_decrypt): Remove. (_gcry_cipher_ofb_encrypt): remove copying of IV to lastiv, it's unused there. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2013-10-25tests: Add tests for mpi_cmp.Werner Koch1-20/+144
* tests/mpitests.c (die): Modernize. (fail): New. (test_opaque, test_add, test_sub, test_mul): Use gcry_log_xx (main): Return error count. (test_cmp): New. Signed-off-by: Werner Koch <wk@gnupg.org>
2013-10-24ecc: Change algorithm for Ed25519 x recovery.Werner Koch4-60/+111
* cipher/ecc-eddsa.c (scanval): Add as temporary hack. (_gcry_ecc_eddsa_recover_x): Use the algorithm from page 15 of the paper. Return an error code. (_gcry_ecc_eddsa_decodepoint): Take care of the error code. * mpi/mpi-mul.c (gcry_mpi_mulm): Use truncated division. Signed-off-by: Werner Koch <wk@gnupg.org>
2013-10-24ecc: Refactor _gcry_ecc_eddsa_decodepoint.Werner Koch2-53/+62
* cipher/ecc-eddsa.c (_gcry_ecc_eddsa_decodepoint): Factor some code out to .. (_gcry_ecc_eddsa_recover_x): new. Signed-off-by: Werner Koch <wk@gnupg.org>
2013-10-24ecc-gost: Add missing includeJussi Kivilinna1-0/+1
* ecc-gost.c: Include "pubkey-internal.h". -- Patch fixes compiler warning: ecc-gost.c: In function '_gcry_ecc_gost_sign': ecc-gost.c:95:11: warning: implicit declaration of function '_gcry_dsa_gen_k' [-Wimplicit-function-declaration] k = _gcry_dsa_gen_k (skey->E.n, GCRY_STRONG_RANDOM); ^ Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-23Replace architecture specific fast_wipememory2 with genericJussi Kivilinna1-60/+25
* src/g10lib.h (fast_wipememory2): Remove architecture specific implementations and add generic implementation. -- Reduce code size, adds support for other architectures and gcc appears to generated better code without assembly parts. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-23Improve the speed of the cipher mode codeJussi Kivilinna13-148/+188
* cipher/bufhelp.h (buf_cpy): New. (buf_xor, buf_xor_2dst): If buffers unaligned, always jump to per-byte processing. (buf_xor_n_copy_2): New. (buf_xor_n_copy): Use 'buf_xor_n_copy_2'. * cipher/blowfish.c (_gcry_blowfish_cbc_dec): Avoid extra memory copy and use new 'buf_xor_n_copy_2'. * cipher/camellia-glue.c (_gcry_camellia_cbc_dec): Ditto. * cipher/cast5.c (_gcry_cast_cbc_dec): Ditto. * cipher/serpent.c (_gcry_serpent_cbc_dec): Ditto. * cipher/twofish.c (_gcry_twofish_cbc_dec): Ditto. * cipher/rijndael.c (_gcry_aes_cbc_dec): Ditto. (do_encrypt, do_decrypt): Use 'buf_cpy' instead of 'memcpy'. (_gcry_aes_cbc_enc): Avoid copying IV, use 'last_iv' pointer instead. * cipher/cipher-cbc.c (_gcry_cipher_cbc_encrypt): Avoid copying IV, update pointer to IV instead. (_gcry_cipher_cbc_decrypt): Avoid extra memory copy and use new 'buf_xor_n_copy_2'. (_gcry_cipher_cbc_encrypt, _gcry_cipher_cbc_decrypt): Avoid extra accesses to c->spec, use 'buf_cpy' instead of memcpy. * cipher/cipher-ccm.c (do_cbc_mac): Ditto. * cipher/cipher-cfb.c (_gcry_cipher_cfb_encrypt) (_gcry_cipher_cfb_decrypt): Ditto. * cipher/cipher-ctr.c (_gcry_cipher_ctr_encrypt): Ditto. * cipher/cipher-ofb.c (_gcry_cipher_ofb_encrypt) (_gcry_cipher_ofb_decrypt): Ditto. * cipher/cipher.c (do_ecb_encrypt, do_ecb_decrypt): Ditto. -- Patch improves the speed of the generic block cipher mode code. Especially on targets without faster unaligned memory accesses, the generic code was slower than the algorithm specific bulk versions. With this patch, this issue should be solved. Tests on Cortex-A8; compiled for ARMv4, without unaligned-accesses: Before: ECB/Stream CBC CFB OFB CTR CCM --------------- --------------- --------------- --------------- --------------- --------------- SEED 490ms 500ms 560ms 580ms 530ms 540ms 560ms 560ms 550ms 540ms 1080ms 1080ms TWOFISH 230ms 230ms 290ms 300ms 260ms 240ms 290ms 290ms 240ms 240ms 520ms 510ms DES 720ms 720ms 800ms 860ms 770ms 770ms 810ms 820ms 770ms 780ms - - CAST5 340ms 340ms 440ms 250ms 390ms 250ms 440ms 430ms 260ms 250ms - - After: ECB/Stream CBC CFB OFB CTR CCM --------------- --------------- --------------- --------------- --------------- --------------- SEED 500ms 490ms 520ms 520ms 530ms 520ms 530ms 540ms 500ms 520ms 1060ms 1070ms TWOFISH 230ms 220ms 250ms 230ms 260ms 230ms 260ms 260ms 230ms 230ms 500ms 490ms DES 720ms 720ms 750ms 760ms 740ms 750ms 770ms 770ms 760ms 760ms - - CAST5 340ms 340ms 370ms 250ms 370ms 250ms 380ms 390ms 250ms 250ms - - Tests on Cortex-A8; compiled for ARMv7-A, with unaligned-accesses: Before: ECB/Stream CBC CFB OFB CTR CCM --------------- --------------- --------------- --------------- --------------- --------------- SEED 430ms 440ms 480ms 530ms 470ms 460ms 490ms 480ms 470ms 460ms 930ms 940ms TWOFISH 220ms 220ms 250ms 230ms 240ms 230ms 270ms 250ms 230ms 240ms 480ms 470ms DES 550ms 540ms 620ms 690ms 570ms 540ms 630ms 650ms 590ms 580ms - - CAST5 300ms 300ms 380ms 230ms 330ms 230ms 380ms 370ms 230ms 230ms - - After: ECB/Stream CBC CFB OFB CTR CCM --------------- --------------- --------------- --------------- --------------- --------------- SEED 430ms 430ms 460ms 450ms 460ms 450ms 470ms 470ms 460ms 470ms 900ms 930ms TWOFISH 220ms 210ms 240ms 230ms 230ms 230ms 250ms 250ms 230ms 230ms 470ms 470ms DES 540ms 540ms 580ms 570ms 570ms 570ms 560ms 620ms 580ms 570ms - - CAST5 300ms 290ms 310ms 230ms 320ms 230ms 350ms 350ms 230ms 230ms - - Tests on Intel Atom N160 (i386): Before: ECB/Stream CBC CFB OFB CTR CCM --------------- --------------- --------------- --------------- --------------- --------------- SEED 380ms 380ms 410ms 420ms 400ms 400ms 410ms 410ms 390ms 400ms 820ms 800ms TWOFISH 340ms 340ms 370ms 350ms 360ms 340ms 370ms 370ms 330ms 340ms 710ms 700ms DES 660ms 650ms 710ms 740ms 680ms 700ms 700ms 710ms 680ms 680ms - - CAST5 340ms 340ms 380ms 330ms 360ms 330ms 390ms 390ms 320ms 330ms - - After: ECB/Stream CBC CFB OFB CTR CCM --------------- --------------- --------------- --------------- --------------- --------------- SEED 380ms 380ms 390ms 410ms 400ms 390ms 410ms 400ms 400ms 390ms 810ms 800ms TWOFISH 330ms 340ms 350ms 360ms 350ms 340ms 380ms 370ms 340ms 360ms 700ms 710ms DES 630ms 640ms 660ms 690ms 680ms 680ms 700ms 690ms 680ms 680ms - - CAST5 340ms 330ms 350ms 330ms 370ms 340ms 380ms 390ms 330ms 330ms - - Tests in Intel i5-4570 (x86-64): Before: ECB/Stream CBC CFB OFB CTR CCM --------------- --------------- --------------- --------------- --------------- --------------- SEED 560ms 560ms 600ms 590ms 600ms 570ms 570ms 570ms 580ms 590ms 1200ms 1180ms TWOFISH 240ms 240ms 270ms 160ms 260ms 160ms 250ms 250ms 160ms 160ms 430ms 430ms DES 570ms 570ms 640ms 590ms 630ms 580ms 600ms 600ms 610ms 620ms - - CAST5 410ms 410ms 470ms 150ms 470ms 150ms 450ms 450ms 150ms 160ms - - After: ECB/Stream CBC CFB OFB CTR CCM --------------- --------------- --------------- --------------- --------------- --------------- SEED 560ms 560ms 590ms 570ms 580ms 570ms 570ms 570ms 590ms 590ms 1200ms 1200ms TWOFISH 240ms 240ms 260ms 160ms 250ms 170ms 250ms 250ms 160ms 160ms 430ms 430ms DES 570ms 570ms 620ms 580ms 630ms 570ms 600ms 590ms 620ms 620ms - - CAST5 410ms 410ms 460ms 150ms 460ms 160ms 450ms 450ms 150ms 150ms - - Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-23bufhelp: enable unaligned memory accesses for AArch64 (64-bit ARM)Jussi Kivilinna1-1/+2
* cipher/bufhelp.h [__aarch64__] (BUFHELP_FAST_UNALIGNED_ACCESS): Set macro on AArch64. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-23Enable assembler optimizations on earlier ARM coresDmitry Eremin-Solenikov14-325/+361
* cipher/blowfish-armv6.S => cipher/blowfish-arm.S: adapt to pre-armv6 CPUs. * cipher/blowfish.c: enable assembly on armv4/armv5 little-endian CPUs. * cipher/camellia-armv6.S => cipher/camellia-arm.S: adapt to pre-armv6 CPUs. * cipher/camellia.c, cipher-camellia-glue.c: enable assembly on armv4/armv5 little-endian CPUs. * cipher/cast5-armv6.S => cipher/cast5-arm.S: adapt to pre-armv6 CPUs. * cipher/cast5.c: enable assembly on armv4/armv5 little-endian CPUs. * cipher/rijndael-armv6.S => cipher/rijndael-arm.S: adapt to pre-armv6 CPUs. * cipher/rijndael.c: enable assembly on armv4/armv5 little-endian CPUs. * cipher/twofish-armv6.S => cipher/twofish-arm.S: adapt to pre-armv6 CPUs. * cipher/twofish.c: enable assembly on armv4/armv5 little-endian CPUs. -- Our ARMv6 assembly optimized code can be easily adapted to earlier CPUs. The only incompatible place is rev instruction used to do byte swapping. Replace it on <= ARMv6 with a series of 4 instructions. Compare: ECB/Stream CBC CFB OFB CTR --------------- --------------- --------------- --------------- --------------- AES 620ms 610ms 650ms 680ms 620ms 630ms 660ms 660ms 630ms 630ms CAMELLIA128 720ms 720ms 780ms 790ms 770ms 760ms 780ms 780ms 770ms 760ms CAMELLIA256 910ms 910ms 970ms 970ms 960ms 950ms 970ms 970ms 960ms 950ms CAST5 820ms 820ms 930ms 920ms 890ms 860ms 930ms 920ms 880ms 890ms BLOWFISH 550ms 560ms 650ms 660ms 630ms 600ms 660ms 650ms 610ms 620ms ECB/Stream CBC CFB OFB CTR --------------- --------------- --------------- --------------- --------------- AES 130ms 140ms 180ms 200ms 160ms 170ms 190ms 200ms 170ms 170ms CAMELLIA128 150ms 160ms 210ms 220ms 200ms 190ms 210ms 220ms 190ms 190ms CAMELLIA256 180ms 180ms 260ms 240ms 240ms 230ms 250ms 250ms 230ms 230ms CAST5 170ms 160ms 270ms 120ms 240ms 130ms 260ms 270ms 130ms 120ms BLOWFISH 160ms 150ms 260ms 110ms 230ms 120ms 250ms 260ms 110ms 120ms Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> [ jk: in camellia.h and twofish.c, USE_ARMV6_ASM => USE_ARM_ASM ] [ jk: fix blowfish-arm.S when __ARM_FEATURE_UNALIGNED defined ] [ jk: in twofish.S remove defined(HAVE_ARM_ARCH_V6) ] [ jk: ARMv6 => ARM in comments ]
2013-10-23mpi: enable assembler on all arm architecturesDmitry Eremin-Solenikov7-18/+13
* mpi/config.links: remove check for arm >= v6 * mpi/armv6 => mpi/arm: rename directory to reflect that is is generic enough -- MPI ARM assembly do not depend on CPU being armv6. Verified on PXA255: Before: Algorithm generate 100*sign 100*verify ------------------------------------------------ RSA 1024 bit 3990ms 57980ms 1680ms RSA 2048 bit 59620ms 389430ms 5690ms RSA 3072 bit 535850ms 1223200ms 12000ms RSA 4096 bit 449350ms 2707370ms 20050ms After: Algorithm generate 100*sign 100*verify ------------------------------------------------ RSA 1024 bit 2190ms 13730ms 320ms RSA 2048 bit 12750ms 67640ms 810ms RSA 3072 bit 110520ms 166100ms 1350ms RSA 4096 bit 100870ms 357560ms 2170ms Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> [ jk: ARMv6 => ARM in header comments ]
2013-10-23Correct ASM assembly test in configure.acDmitry Eremin-Solenikov1-3/+2
* configure.ac: correct HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS test to require neither ARMv6, nor thumb mode. Our assembly code works perfectly even on ARMv4 now. Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
2013-10-23ecc: Refactor ecc.cWerner Koch7-1066/+1195
* cipher/ecc-ecdsa.c, cipher/ecc-eddsa.c, cipher/ecc-gost.c: New. * cipher/Makefile.am (EXTRA_libcipher_la_SOURCES): Add new files. * configure.ac (GCRYPT_PUBKEY_CIPHERS): Add new files. * cipher/ecc.c (point_init, point_free): Move to ecc-common.h. (sign_ecdsa): Move to ecc-ecdsa.c as _gcry_ecc_ecdsa_sign. (verify_ecdsa): Move to ecc-ecdsa.c as _gcry_ecc_ecdsa_verify. (sign_gost): Move to ecc-gots.c as _gcry_ecc_gost_sign. (verify_gost): Move to ecc-gost.c as _gcry_ecc_gost_verify. (sign_eddsa): Move to ecc-eddsa.c as _gcry_ecc_eddsa_sign. (verify_eddsa): Move to ecc-eddsa.c as _gcry_ecc_eddsa_verify. (eddsa_generate_key): Move to ecc-eddsa.c as _gcry_ecc_eddsa_genkey. (reverse_buffer): Move to ecc-eddsa.c. (eddsa_encodempi, eddsa_encode_x_y): Ditto. (_gcry_ecc_eddsa_encodepoint, _gcry_ecc_eddsa_decodepoint): Ditto. -- This change should make it easier to add new ECC algorithms. Signed-off-by: Werner Koch <wk@gnupg.org>
2013-10-23mpi: Fix scanning of negative SSH formats and add more tests.Werner Koch4-102/+166
* mpi/mpicoder.c (gcry_mpi_scan): Fix sign setting for SSH format. * tests/t-convert.c (negative_zero): Test all formats. (check_formats): Add tests for PGP and scan tests for SSH and USG. * src/gcrypt.h.in (mpi_is_neg): Fix macro. * mpi/mpi-scan.c (_gcry_mpi_getbyte, _gcry_mpi_putbyte): Comment out these unused functions. Signed-off-by: Werner Koch <wk@gnupg.org>
2013-10-22twofish: add ARMv6 assembly implementationJussi Kivilinna4-27/+432
* cipher/Makefile.am: Add 'twofish-armv6.S'. * cipher/twofish-armv6.S: New. * cipher/twofish.c (USE_ARMV6_ASM): New macro. [USE_ARMV6_ASM] (_gcry_twofish_armv6_encrypt_block) (_gcry_twofish_armv6_decrypt_block): New prototypes. [USE_AMDV6_ASM] (twofish_encrypt, twofish_decrypt): Add. [USE_AMD64_ASM] (do_twofish_encrypt, do_twofish_decrypt): Remove. (_gcry_twofish_ctr_enc, _gcry_twofish_cfb_dec): Use 'twofish_encrypt' instead of 'do_twofish_encrypt'. (_gcry_twofish_cbc_dec): Use 'twofish_decrypt' instead of 'do_twofish_decrypt'. * configure.ac [arm]: Add 'twofish-armv6.lo'. -- Add optimized ARMv6 assembly implementation for Twofish. Implementation is tuned for Cortex-A8. Unaligned access handling is done in assembly part. For now, only enable this on little-endian systems as big-endian correctness have not been tested yet. Old (gcc-4.8) vs new (twofish-asm), Cortex-A8 (on armhf): ECB/Stream CBC CFB OFB CTR CCM --------------- --------------- --------------- --------------- --------------- --------------- TWOFISH 1.23x 1.25x 1.16x 1.26x 1.16x 1.30x 1.18x 1.17x 1.23x 1.23x 1.22x 1.22x Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-22mpi: allow building with clang on ARMJussi Kivilinna1-14/+13
* mpi/longlong.h [__arm__] (add_ssaaaa, sub_ddmmss, umul_ppmm) (count_leading_zeros): Do not cast assembly output arguments. [__arm__] (umul_ppmm): Remove the extra '%' ahead of assembly comment. [_ARM_ARCH >= 4] (umul_ppmm): Use correct inputs and outputs instead of registers. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-22serpent-amd64: do not use GAS macrosJussi Kivilinna3-593/+440
* cipher/serpent-avx2-amd64.S: Remove use of GAS macros. * cipher/serpent-sse2-amd64.S: Ditto. * configure.ac [HAVE_COMPATIBLE_GCC_AMD64_PLATFORM_AS]: Do not check for GAS macros. -- This way we have better portability; for example, when compiling with clang on x86-64, the assembly implementations are now enabled and working. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-22Add Counter with CBC-MAC mode (CCM)Jussi Kivilinna8-25/+1376
* cipher/Makefile.am: Add 'cipher-ccm.c'. * cipher/cipher-ccm.c: New. * cipher/cipher-internal.h (gcry_cipher_handle): Add 'u_mode'. (_gcry_cipher_ccm_encrypt, _gcry_cipher_ccm_decrypt) (_gcry_cipher_ccm_set_nonce, _gcry_cipher_ccm_authenticate) (_gcry_cipher_ccm_get_tag, _gcry_cipher_ccm_check_tag) (_gcry_cipher_ccm_set_lengths): New prototypes. * cipher/cipher.c (gcry_cipher_open, cipher_encrypt, cipher_decrypt) (_gcry_cipher_setiv, _gcry_cipher_authenticate, _gcry_cipher_gettag) (_gcry_cipher_checktag, gry_cipher_ctl): Add handling for CCM mode. * doc/gcrypt.texi: Add documentation for GCRY_CIPHER_MODE_CCM. * src/gcrypt.h.in (gcry_cipher_modes): Add 'GCRY_CIPHER_MODE_CCM'. (gcry_ctl_cmds): Add 'GCRYCTL_SET_CCM_LENGTHS'. (GCRY_CCM_BLOCK_LEN): New. * tests/basic.c (check_ccm_cipher): New. (check_cipher_modes): Call 'check_ccm_cipher'. * tests/benchmark.c (ccm_aead_init): New. (cipher_bench): Add handling for AEAD modes and add CCM benchmarking. -- Patch adds CCM (Counter with CBC-MAC) mode as defined in RFC 3610 and NIST Special Publication 800-38C. Example for encrypting message (split in two buffers; buf1, buf2) and authenticating additional non-encrypted data (split in two buffers; aadbuf1, aadbuf2) with authentication tag length of eigth bytes: size_t params[3]; taglen = 8; gcry_cipher_setkey(h, key, len(key)); gcry_cipher_setiv(h, nonce, len(nonce)); params[0] = len(buf1) + len(buf2); /* 0: enclen */ params[1] = len(aadbuf1) + len(aadbuf2); /* 1: aadlen */ params[2] = taglen; /* 2: authtaglen */ gcry_cipher_ctl(h, GCRYCTL_SET_CCM_LENGTHS, params, sizeof(size_t) * 3); gcry_cipher_authenticate(h, aadbuf1, len(aadbuf1)); gcry_cipher_authenticate(h, aadbuf2, len(aadbuf2)); gcry_cipher_encrypt(h, buf1, len(buf1), buf1, len(buf1)); gcry_cipher_encrypt(h, buf2, len(buf2), buf2, len(buf2)); gcry_cipher_gettag(h, tag, taglen); Example for decrypting above message and checking authentication tag: size_t params[3]; taglen = 8; gcry_cipher_setkey(h, key, len(key)); gcry_cipher_setiv(h, nonce, len(nonce)); params[0] = len(buf1) + len(buf2); /* 0: enclen */ params[1] = len(aadbuf1) + len(aadbuf2); /* 1: aadlen */ params[2] = taglen; /* 2: authtaglen */ gcry_cipher_ctl(h, GCRYCTL_SET_CCM_LENGTHS, params, sizeof(size_t) * 3); gcry_cipher_authenticate(h, aadbuf1, len(aadbuf1)); gcry_cipher_authenticate(h, aadbuf2, len(aadbuf2)); gcry_cipher_decrypt(h, buf1, len(buf1), buf1, len(buf1)); gcry_cipher_decrypt(h, buf2, len(buf2), buf2, len(buf2)); err = gcry_cipher_checktag(h, tag, taglen); if (gpg_err_code (err) == GPG_ERR_CHECKSUM) { /* Authentication failed. */ } else if (err == 0) { /* Authentication ok. */ } Example for encrypting message without additional authenticated data: size_t params[3]; taglen = 10; gcry_cipher_setkey(h, key, len(key)); gcry_cipher_setiv(h, nonce, len(nonce)); params[0] = len(buf1); /* 0: enclen */ params[1] = 0; /* 1: aadlen */ params[2] = taglen; /* 2: authtaglen */ gcry_cipher_ctl(h, GCRYCTL_SET_CCM_LENGTHS, params, sizeof(size_t) * 3); gcry_cipher_encrypt(h, buf1, len(buf1), buf1, len(buf1)); gcry_cipher_gettag(h, tag, taglen); To reset CCM state for cipher handle, one can either set new nonce or use 'gcry_cipher_reset'. This implementation reuses existing CTR mode code for encryption/decryption and is there for able to process multiple buffers that are not multiple of blocksize. AAD data maybe also be passed into gcry_cipher_authenticate in non-blocksize chunks. [v4]: GCRYCTL_SET_CCM_PARAMS => GCRY_SET_CCM_LENGTHS Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-22Add API to support AEAD cipher modesJussi Kivilinna7-0/+120
* cipher/cipher.c (_gcry_cipher_authenticate, _gcry_cipher_checktag) (_gcry_cipher_gettag): New. * doc/gcrypt.texi: Add documentation for new API functions. * src/visibility.c (gcry_cipher_authenticate, gcry_cipher_checktag) (gcry_cipher_gettag): New. * src/gcrypt.h.in, src/visibility.h: add declarations of these functions. * src/libgcrypt.defs, src/libgcrypt.vers: export functions. -- Authenticated Encryption with Associated Data (AEAD) cipher modes provide authentication tag that can be used to authenticate message. At the same time it allows one to specify additional (unencrypted data) that will be authenticated together with the message. This class of cipher modes requires additional API present in this commit. This patch is based on original patch by Dmitry Eremin-Solenikov. Changes in v2: - Change gcry_cipher_tag to gcry_cipher_checktag and gcry_cipher_gettag for giving tag (checktag) for decryption and reading tag (gettag) after encryption. - Change gcry_cipher_authenticate to gcry_cipher_setaad, since additional parameters needed for some AEAD modes (in this case CCM, which needs the length of encrypted data and tag for MAC initialization). - Add some documentation. Changes in v3: - Change gcry_cipher_setaad back to gcry_cipher_authenticate. Additional parameters (encrypt_len, tag_len, aad_len) for CCM will be given through GCRY_CTL_SET_CCM_LENGTHS. Changes in v4: - log_fatal => log_error Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-22ecc: Correct compliant key generation for Edwards curves.NIIBE Yutaka1-10/+23
* cipher/ecc.c: Add case for Edwards curves. Signed-off-by: Werner Koch <wk@gnupg.org>
2013-10-17tests: Add test options to keygen.Werner Koch1-11/+40
* tests/keygen.c (usage): New. (main): Print usage info. Allow running just one algo. Signed-off-by: Werner Koch <wk@gnupg.org>
2013-10-17mpi: Do not clear the sign of the mpi_mod result.Werner Koch1-1/+0
* mpi/mpi-mod.c (_gcry_mpi_mod): Remove sign setting. Signed-off-by: Werner Koch <wk@gnupg.org>