author | Jussi Kivilinna <jussi.kivilinna@iki.fi> | 2015-08-10 22:09:56 +0300 |
---|---|---|
committer | Jussi Kivilinna <jussi.kivilinna@iki.fi> | 2015-08-10 22:09:56 +0300 |
commit | 49f52c67fb42c0656c8f9af655087f444562ca82 (patch) | |
tree | 2ef935a60649db8d61b3e1f36982788a15a10506 /cipher/twofish.c | |
parent | ce746936b6c210e602d106cfbf45cf60b408d871 (diff) | |
download | libgcrypt-49f52c67fb42c0656c8f9af655087f444562ca82.tar.gz |
Optimize OCB offset calculation
* cipher/cipher-internal.h (ocb_get_l): New.
* cipher/cipher-ocb.c (_gcry_cipher_ocb_authenticate)
(ocb_crypt): Use 'ocb_get_l' instead of '_gcry_cipher_ocb_get_l'.
* cipher/camellia-glue.c (get_l): Remove.
(_gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth): Precalculate
offset array when block count matches parallel operation size; Use
'ocb_get_l' instead of 'get_l'.
* cipher/rijndael-aesni.c (get_l): Add fast path for 75% most common
offsets.
(aesni_ocb_enc, aesni_ocb_dec, _gcry_aes_aesni_ocb_auth): Precalculate
offset array when block count matches parallel operation size.
* cipher/rijndael-ssse3-amd64.c (get_l): Add fast path for 75% most
common offsets.
* cipher/rijndael.c (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth): Use
'ocb_get_l' instead of '_gcry_cipher_ocb_get_l'.
* cipher/serpent.c (get_l): Remove.
(_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): Precalculate
offset array when block count matches parallel operation size; Use
'ocb_get_l' instead of 'get_l'.
* cipher/twofish.c (get_l): Remove.
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Use 'ocb_get_l'
instead of 'get_l'.
--
Patch optimizes OCB offset calculation for generic code and
assembly implementations with parallel block processing.
Benchmark of OCB AES-NI on Intel Haswell:
$ tests/bench-slope --cpu-mhz 3201 cipher aes
Before:
AES | nanosecs/byte mebibytes/sec cycles/byte
CTR enc | 0.274 ns/B 3483.9 MiB/s 0.876 c/B
CTR dec | 0.273 ns/B 3490.0 MiB/s 0.875 c/B
OCB enc | 0.289 ns/B 3296.1 MiB/s 0.926 c/B
OCB dec | 0.299 ns/B 3189.9 MiB/s 0.957 c/B
OCB auth | 0.260 ns/B 3670.0 MiB/s 0.832 c/B
After:
AES | nanosecs/byte mebibytes/sec cycles/byte
CTR enc | 0.273 ns/B 3489.4 MiB/s 0.875 c/B
CTR dec | 0.273 ns/B 3487.5 MiB/s 0.875 c/B
OCB enc | 0.248 ns/B 3852.8 MiB/s 0.792 c/B
OCB dec | 0.261 ns/B 3659.5 MiB/s 0.834 c/B
OCB auth | 0.227 ns/B 4205.5 MiB/s 0.726 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
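
The "Precalculate offset array when block count matches parallel operation size" entries above rest on a property of the trailing-zero count: when the block counter is a multiple of the chunk size, the ntz values of the next blocks in the chunk are always the same, and only the last one varies. The following is a minimal sketch of that property, not the libgcrypt code; `PARALLEL_BLOCKS`, `ntz64`, `fill_ntz_pattern` and `chunk_matches_pattern` are hypothetical names chosen for illustration, and the chunk size of 8 is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical chunk size for illustration; must be a power of two. */
#define PARALLEL_BLOCKS 8

/* Number of trailing zero bits of a non-zero 64-bit value. */
static unsigned int
ntz64 (uint64_t i)
{
  unsigned int n = 0;

  assert (i != 0);
  while ((i & 1) == 0)
    {
      i >>= 1;
      n++;
    }
  return n;
}

/* Store ntz(k + 1) for k = 0 .. PARALLEL_BLOCKS - 2.  These are the
   L-table indices for all but the last block of an aligned chunk. */
static void
fill_ntz_pattern (unsigned int pattern[PARALLEL_BLOCKS - 1])
{
  unsigned int k;

  for (k = 0; k < PARALLEL_BLOCKS - 1; k++)
    pattern[k] = ntz64 ((uint64_t)k + 1);
}

/* Return 1 if the chunk that follows block counter BLKN (a multiple of
   PARALLEL_BLOCKS) uses exactly the precalculated indices, else 0. */
static int
chunk_matches_pattern (uint64_t blkn, const unsigned int *pattern)
{
  unsigned int k;

  assert (blkn % PARALLEL_BLOCKS == 0);
  for (k = 0; k < PARALLEL_BLOCKS - 1; k++)
    if (ntz64 (blkn + k + 1) != pattern[k])
      return 0;
  return 1;
}
```

Under these assumptions, a bulk routine could fill its offset pointer array once from the pattern and look up only the final entry of each chunk, whose index is ntz(blkn + PARALLEL_BLOCKS) and therefore changes from chunk to chunk.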
Diffstat (limited to 'cipher/twofish.c')
-rw-r--r-- | cipher/twofish.c | 25 |
1 files changed, 6 insertions, 19 deletions
diff --git a/cipher/twofish.c b/cipher/twofish.c
index 11e60a74..7f361c99 100644
--- a/cipher/twofish.c
+++ b/cipher/twofish.c
@@ -1247,19 +1247,6 @@ _gcry_twofish_cfb_dec(void *context, unsigned char *iv, void *outbuf_arg,
   _gcry_burn_stack(burn_stack_depth);
 }
 
-#ifdef USE_AMD64_ASM
-static inline const unsigned char *
-get_l (gcry_cipher_hd_t c, unsigned char *l_tmp, u64 i)
-{
-  unsigned int ntz = _gcry_ctz64 (i);
-
-  if (ntz < OCB_L_TABLE_SIZE)
-    return c->u_mode.ocb.L[ntz];
-  else
-    return _gcry_cipher_ocb_get_l (c, l_tmp, i);
-}
-#endif
-
 /* Bulk encryption/decryption of complete blocks in OCB mode. */
 size_t
 _gcry_twofish_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg,
@@ -1280,9 +1267,9 @@ _gcry_twofish_ocb_crypt (gcry_cipher_hd_t c, void *outbuf_arg,
       while (nblocks >= 3)
         {
           /* l_tmp will be used only every 65536-th block. */
-          Ls[0] = get_l(c, l_tmp, blkn + 1);
-          Ls[1] = get_l(c, l_tmp, blkn + 2);
-          Ls[2] = get_l(c, l_tmp, blkn + 3);
+          Ls[0] = ocb_get_l(c, l_tmp, blkn + 1);
+          Ls[1] = ocb_get_l(c, l_tmp, blkn + 2);
+          Ls[2] = ocb_get_l(c, l_tmp, blkn + 3);
           blkn += 3;
 
           if (encrypt)
@@ -1339,9 +1326,9 @@ _gcry_twofish_ocb_auth (gcry_cipher_hd_t c, const void *abuf_arg,
       while (nblocks >= 3)
         {
           /* l_tmp will be used only every 65536-th block. */
-          Ls[0] = get_l(c, l_tmp, blkn + 1);
-          Ls[1] = get_l(c, l_tmp, blkn + 2);
-          Ls[2] = get_l(c, l_tmp, blkn + 3);
+          Ls[0] = ocb_get_l(c, l_tmp, blkn + 1);
+          Ls[1] = ocb_get_l(c, l_tmp, blkn + 2);
+          Ls[2] = ocb_get_l(c, l_tmp, blkn + 3);
           blkn += 3;
 
           twofish_amd64_ocb_auth(ctx, abuf, c->u_mode.ocb.aad_offset,
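
The removed `get_l` helper picks the OCB offset value for block `i` by counting the trailing zero bits of `i` and indexing a precomputed table, falling back to a slower path only when the index is outside the table. The following is a minimal standalone sketch of that lookup, not the libgcrypt implementation: `struct ocb_sketch`, `L_TABLE_SIZE`, `ntz64` and `lookup_l` are hypothetical stand-ins, and the table size of 16 is an assumption inferred from the "every 65536-th block" comment in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for OCB_L_TABLE_SIZE and the cipher block size. */
#define L_TABLE_SIZE 16
#define BLOCK_BYTES 16

/* Hypothetical stand-in for the handle state holding precomputed L values. */
struct ocb_sketch
{
  unsigned char L[L_TABLE_SIZE][BLOCK_BYTES];
};

/* Number of trailing zero bits of a non-zero 64-bit value. */
static unsigned int
ntz64 (uint64_t i)
{
  unsigned int n = 0;

  assert (i != 0);
  while ((i & 1) == 0)
    {
      i >>= 1;
      n++;
    }
  return n;
}

/* Return the table entry for block number I, or NULL when ntz(I) is
   outside the table and the caller would need a slow path. */
static const unsigned char *
lookup_l (const struct ocb_sketch *s, uint64_t i)
{
  unsigned int ntz = ntz64 (i);

  if (ntz < L_TABLE_SIZE)
    return s->L[ntz];
  return NULL;
}
```

In this sketch, half of all block numbers are odd and map to `L[0]`, and a further quarter have exactly one trailing zero and map to `L[1]`, which is presumably the "75% most common offsets" the commit message refers to. With a 16-entry table the NULL branch is reached only for block numbers that are multiples of 65536.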