summaryrefslogtreecommitdiff
path: root/cipher/camellia.c
AgeCommit message (Collapse)AuthorFilesLines
2013-11-06Fix 'u32' build error with CamelliaJussi Kivilinna1-3/+3
* cipher/camellia.c: Add include for <config.h> and "types.h". (u32): Remove. (u8): Typedef as 'byte'. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-10-23Enable assembler optimizations on earlier ARM coresDmitry Eremin-Solenikov1-4/+4
* cipher/blowfish-armv6.S => cipher/blowfish-arm.S: adapt to pre-armv6 CPUs. * cipher/blowfish.c: enable assembly on armv4/armv5 little-endian CPUs. * cipher/camellia-armv6.S => cipher/camellia-arm.S: adapt to pre-armv6 CPUs. * cipher/camellia.c, cipher-camellia-glue.c: enable assembly on armv4/armv5 little-endian CPUs. * cipher/cast5-armv6.S => cipher/cast5-arm.S: adapt to pre-armv6 CPUs. * cipher/cast5.c: enable assembly on armv4/armv5 little-endian CPUs. * cipher/rijndael-armv6.S => cipher/rijndael-arm.S: adapt to pre-armv6 CPUs. * cipher/rijndael.c: enable assembly on armv4/armv5 little-endian CPUs. * cipher/twofish-armv6.S => cipher/twofish-arm.S: adapt to pre-armv6 CPUs. * cipher/twofish.c: enable assembly on armv4/armv5 little-endian CPUs. -- Our ARMv6 assembly optimized code can be easily adapted to earlier CPUs. The only incompatible place is rev instruction used to do byte swapping. Replace it on <= ARMv6 with a series of 4 instructions. Compare: ECB/Stream CBC CFB OFB CTR --------------- --------------- --------------- --------------- --------------- AES 620ms 610ms 650ms 680ms 620ms 630ms 660ms 660ms 630ms 630ms CAMELLIA128 720ms 720ms 780ms 790ms 770ms 760ms 780ms 780ms 770ms 760ms CAMELLIA256 910ms 910ms 970ms 970ms 960ms 950ms 970ms 970ms 960ms 950ms CAST5 820ms 820ms 930ms 920ms 890ms 860ms 930ms 920ms 880ms 890ms BLOWFISH 550ms 560ms 650ms 660ms 630ms 600ms 660ms 650ms 610ms 620ms ECB/Stream CBC CFB OFB CTR --------------- --------------- --------------- --------------- --------------- AES 130ms 140ms 180ms 200ms 160ms 170ms 190ms 200ms 170ms 170ms CAMELLIA128 150ms 160ms 210ms 220ms 200ms 190ms 210ms 220ms 190ms 190ms CAMELLIA256 180ms 180ms 260ms 240ms 240ms 230ms 250ms 250ms 230ms 230ms CAST5 170ms 160ms 270ms 120ms 240ms 130ms 260ms 270ms 130ms 120ms BLOWFISH 160ms 150ms 260ms 110ms 230ms 120ms 250ms 260ms 110ms 120ms Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com> [ jk: in camellia.h and twofish.c, USE_ARMV6_ASM => USE_ARM_ASM ] [ jk: fix blowfish-arm.S when __ARM_FEATURE_UNALIGNED defined ] [ jk: in twofish.S remove defined(HAVE_ARM_ARCH_V6) ] [ jk: ARMv6 => ARM in comments ]
2013-09-21Optimize and cleanup 32-bit and 64-bit endianess transformsJussi Kivilinna1-11/+3
* cipher/bithelp.h (bswap32, bswap64, le_bswap32, be_bswap32) (le_bswap64, be_bswap64): New. * cipher/bufhelp.h (buf_get_be32, buf_get_le32, buf_put_le32) (buf_put_be32, buf_get_be64, buf_get_le64, buf_put_be64) (buf_put_le64): New. * cipher/blowfish.c (do_encrypt_block, do_decrypt_block): Use new endian conversion helpers. (do_bf_setkey): Turn endian specific code to generic. * cipher/camellia.c (GETU32, PUTU32): Use new endian conversion helpers. * cipher/cast5.c (rol): Remove, use rol from bithelp. (F1, F2, F3): Fix to use rol from bithelp. (do_encrypt_block, do_decrypt_block, do_cast_setkey): Use new endian conversion helpers. * cipher/des.c (READ_64BIT_DATA, WRITE_64BIT_DATA): Ditto. * cipher/md4.c (transform, md4_final): Ditto. * cipher/md5.c (transform, md5_final): Ditto. * cipher/rmd160.c (transform, rmd160_final): Ditto. * cipher/salsa20.c (LE_SWAP32, LE_READ_UINT32): Ditto. * cipher/scrypt.c (READ_UINT64, LE_READ_UINT64, LE_SWAP32): Ditto. * cipher/seed.c (GETU32, PUTU32): Ditto. * cipher/serpent.c (byte_swap_32): Remove. (serpent_key_prepare, serpent_encrypt_internal) (serpent_decrypt_internal): Use new endian conversion helpers. * cipher/sha1.c (transform, sha1_final): Ditto. * cipher/sha256.c (transform, sha256_final): Ditto. * cipher/sha512.c (__transform, sha512_final): Ditto. * cipher/stribog.c (transform, stribog_final): Ditto. * cipher/tiger.c (transform, tiger_final): Ditto. * cipher/twofish.c (INPACK, OUTUNPACK): Ditto. * cipher/whirlpool.c (buffer_to_block, block_to_buffer): Ditto. * configure.ac (gcry_cv_have_builtin_bswap32): Check for compiler provided __builtin_bswap32. (gcry_cv_have_builtin_bswap64): Check for compiler provided __builtin_bswap64. -- Patch add helper functions that provide conversions to/from integers and buffers of different endianess. Benefits are code cleanup and optimization for architectures that have byte-swaping instructions and/or can do fast unaligned memory accesses. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-08-16camellia: add ARMv6 assembly implementationJussi Kivilinna1-0/+5
* cipher/Makefile.am: Add 'camellia-armv6.S'. * cipher/camellia-armv6.S: New file. * cipher/camellia-glue.c [USE_ARMV6_ASM] (_gcry_camellia_armv6_encrypt_block) (_gcry_camellia_armv6_decrypt_block): New prototypes. [USE_ARMV6_ASM] (Camellia_EncryptBlock, Camellia_DecryptBlock) (camellia_encrypt, camellia_decrypt): New functions. * cipher/camellia.c [!USE_ARMV6_ASM]: Compile encryption and decryption routines if USE_ARMV6_ASM macro is _not_ defined. * cipher/camellia.h (USE_ARMV6_ASM): New macro. [!USE_ARMV6_ASM] (Camellia_EncryptBlock, Camellia_DecryptBlock): If USE_ARMV6_ASM is defined, disable these function prototypes. (camellia) [arm]: Add 'camellia-armv6.lo'. -- Add optimized ARMv6 assembly implementation for Camellia. Implementation is tuned for Cortex-A8. Unaligned access handling is done in assembly part. For now. only enable this on little-endian systems as big-endian correctness have not been tested yet. Old vs new. Cortex-A8 (on Debian Wheezy/armhf): ECB/Stream CBC CFB OFB CTR --------------- --------------- --------------- --------------- --------------- CAMELLIA128 1.44x 1.47x 1.35x 1.34x 1.43x 1.39x 1.38x 1.36x 1.38x 1.39x CAMELLIA192 1.60x 1.62x 1.52x 1.47x 1.56x 1.54x 1.52x 1.53x 1.52x 1.53x CAMELLIA256 1.59x 1.60x 1.49x 1.47x 1.53x 1.54x 1.51x 1.50x 1.52x 1.53x Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
2013-02-19Add AES-NI/AVX accelerated Camellia implementationJussi Kivilinna1-3/+2
* configure.ac: Add option --disable-avx-support. (HAVE_GCC_INLINE_ASM_AVX): New. (ENABLE_AVX_SUPPORT): New. (camellia) [ENABLE_AVX_SUPPORT, ENABLE_AESNI_SUPPORT]: Add camellia_aesni_avx_x86-64.lo. * cipher/Makefile.am (AM_CCASFLAGS): Add. (EXTRA_libcipher_la_SOURCES): Add camellia_aesni_avx_x86-64.S * cipher/camellia-glue.c [ENABLE_AESNI_SUPPORT, ENABLE_AVX_SUPPORT] [__x86_64__] (USE_AESNI_AVX): Add macro. (struct Camellia_context) [USE_AESNI_AVX]: Add use_aesni_avx. [USE_AESNI_AVX] (_gcry_camellia_aesni_avx_ctr_enc) (_gcry_camellia_aesni_avx_cbc_dec): New prototypes to assembly functions. (camellia_setkey) [USE_AESNI_AVX]: Enable AES-NI/AVX if hardware support both. (_gcry_camellia_ctr_enc) [USE_AESNI_AVX]: Add AES-NI/AVX code. (_gcry_camellia_cbc_dec) [USE_AESNI_AVX]: Add AES-NI/AVX code. * cipher/camellia_aesni_avx_x86-64.S: New. * src/g10lib.h (HWF_INTEL_AVX): New. * src/global.c (hwflist): Add HWF_INTEL_AVX. * src/hwf-x86.c (detect_x86_gnuc) [ENABLE_AVX_SUPPORT]: Add detection for AVX. -- Before: Running each test 250 times. ECB/Stream CBC CFB OFB CTR --------------- --------------- --------------- --------------- --------------- CAMELLIA128 2210ms 2200ms 2300ms 2050ms 2240ms 2250ms 2290ms 2270ms 2070ms 2070ms CAMELLIA256 2810ms 2800ms 2920ms 2670ms 2840ms 2850ms 2910ms 2890ms 2660ms 2640ms After: Running each test 250 times. ECB/Stream CBC CFB OFB CTR --------------- --------------- --------------- --------------- --------------- CAMELLIA128 2200ms 2220ms 2290ms 470ms 2240ms 2270ms 2270ms 2290ms 480ms 480ms CAMELLIA256 2820ms 2820ms 2900ms 600ms 2860ms 2860ms 2900ms 2920ms 620ms 620ms AES-NI/AVX implementation works by processing 16 parallel blocks (256 bytes). It's bytesliced implementation that uses AES-NI (Subbyte) for Camellia sboxes, with help of prefiltering/postfiltering. For smaller data sets generic C implementation is used. Speed-up for CBC-decryption and CTR-mode (large data): 4.3x Tests were run on: Intel Core i5-2450M Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> (license boiler plate update by wk)
2013-02-19camellia.c: Prepare for AES-NI/AVX implementationJussi Kivilinna1-94/+50
* cipher/camellia-glue.c (CAMELLIA_encrypt_stack_burn_size) (CAMELLIA_decrypt_stack_burn_size): Increase stack burn size. * cipher/camellia.c (CAMELLIA_ROUNDSM): Move key-material mixing in the front. (camellia_setup128, camellia_setup256): Remove now unneeded key-material mangling. (camellia_encrypt128, camellia_decrypt128, amellia_encrypt256) (camellia_decrypt256): Copy block to stack, so that compiler can optimize it for register usage. -- Camellia implementation needs to be modified slightly for compatibility with AES-NI/AVX version. Before: Running each test 100 times. ECB/Stream CBC CFB OFB CTR --------------- --------------- --------------- --------------- --------------- CAMELLIA128 800ms 790ms 840ms 730ms 810ms 800ms 820ms 820ms 730ms 740ms CAMELLIA192 1040ms 1040ms 1030ms 930ms 1000ms 1000ms 1020ms 1020ms 940ms 930ms CAMELLIA256 1000ms 980ms 1040ms 930ms 1010ms 990ms 1040ms 1040ms 940ms 930ms After: Running each test 100 times. ECB/Stream CBC CFB OFB CTR --------------- --------------- --------------- --------------- --------------- CAMELLIA128 780ms 750ms 810ms 690ms 780ms 770ms 810ms 810ms 700ms 690ms CAMELLIA192 1020ms 990ms 1000ms 890ms 970ms 970ms 990ms 1000ms 890ms 900ms CAMELLIA256 960ms 960ms 1000ms 900ms 970ms 970ms 990ms 1010ms 900ms 890ms Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
2011-02-04Nuked almost all trailing whitespace.Werner Koch1-13/+13
Check and install the standard git pre-commit hook.
2007-05-31Camellia is now LGPLedWerner Koch1-10/+10
2007-05-02New files for Camellia.David Shaw1-0/+1461