Age | Commit message (Collapse) | Author | Files | Lines |
|
* cipher/ecc.c (ecc_decrypt_raw): Curve25519 is an exception.
--
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
|
|
* cipher/ecc.c (ecc_decrypt_raw): Go to leave to release memory.
* mpi/ec.c (_gcry_mpi_ec_curve_point): Likewise.
--
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
|
|
* cipher/ecc.c (ecc_decrypt_raw): Validate the point.
--
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
(forward port from LIBGCRYPT-1-6-BRANCH
commit 28eb424e4427b320ec1c9c4ce56af25d495230bd)
|
|
* cipher/Makefile.am: Add 'sha512-arm.S'.
* cipher/sha512-arm.S: New.
* cipher/sha512.c (USE_ARM_ASM): New.
(_gcry_sha512_transform_arm): New.
(transform) [USE_ARM_ASM]: Use ARM assembly implementation instead of
generic.
* configure.ac: Add 'sha512-arm.lo'.
--
Benchmark on Cortex-A8 (armv6, 1008 Mhz):
Before:
| nanosecs/byte mebibytes/sec cycles/byte
SHA512 | 112.0 ns/B 8.52 MiB/s 112.9 c/B
After (3.3x faster):
| nanosecs/byte mebibytes/sec cycles/byte
SHA512 | 34.01 ns/B 28.04 MiB/s 34.28 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/ecc-misc.c (gcry_ecc_mont_decodepoint): Fix code path for
short length data.
--
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
|
|
* cipher/ecc-misc.c (gcry_ecc_mont_decodepoint): Fix removing of
prefix. Clear the MSB, according to RFC7748.
--
This change fixes two things.
* Handle the case the prefix 0x40 comes at the end when scanned as
standard MPI.
* Implement MSB handling. In the page 7 of RFC7748, it says about
decoding u-coordinate:
When receiving such an array, implementations of X25519 (but not
X448) MUST mask the most significant bit in the final byte.
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
|
|
* cipher/ecc-misc.c (_gcry_ecc_mont_decodepoint): Fix calc of NBITS
and prefix detection.
* cipher/ecc.c (ecc_generate): Use NBITS instead of CTX->NBITS.
(ecc_encrypt_raw): Use NBITS from curve instead of from P.
Fix rawmpilen calculation.
(ecc_decrypt_raw): Likewise. Add debug output.
--
This fixes the commit dd3d06e7. NBITS is defined 256 in ecc-curves.c,
thus, ecc_get_nbits returns 256. But CTX->NBITS has 255 for Montgomery
curve.
|
|
* cipher/sha256.c (R): Let caller do variable shuffling.
(Chro, Maj, Sum0, Sum1): Convert from inline functions to macros.
(W, I): New.
(transform_blk): Unroll round loop; inline message expansion to rounds
to make message expansion buffer smaller.
--
Benchmark on Cortex-A8 (armv6, 1008 Mhz):
Before:
| nanosecs/byte mebibytes/sec cycles/byte
SHA256 | 27.63 ns/B 34.52 MiB/s 27.85 c/B
After (1.31x faster):
| nanosecs/byte mebibytes/sec cycles/byte
SHA256 | 20.97 ns/B 45.48 MiB/s 21.13 c/B
Benchmark on Cortex-A8 (armv7, 1008 Mhz):
Before:
| nanosecs/byte mebibytes/sec cycles/byte
SHA256 | 24.18 ns/B 39.43 MiB/s 24.38 c/B
After (1.13x faster):
| nanosecs/byte mebibytes/sec cycles/byte
SHA256 | 21.28 ns/B 44.82 MiB/s 21.45 c/B
Benchmark on Intel Core i5-4570 (i386, 3.2 Ghz):
Before:
| nanosecs/byte mebibytes/sec cycles/byte
SHA256 | 5.78 ns/B 164.9 MiB/s 18.51 c/B
After (1.06x faster)
| nanosecs/byte mebibytes/sec cycles/byte
SHA256 | 5.41 ns/B 176.1 MiB/s 17.33 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* mpi/ec.c (_gcry_mpi_ec_decode_point): New.
* cipher/ecc-common.h: Move two prototypes to ...
* src/ec-context.h: here.
* src/gcrypt.h.in (gcry_mpi_ec_decode_point): New.
* src/libgcrypt.def (gcry_mpi_ec_decode_point): New.
* src/libgcrypt.vers (gcry_mpi_ec_decode_point): New.
* src/visibility.c (gcry_mpi_ec_decode_point): New.
* src/visibility.h: Add new function.
--
This new function make the use of the gcry_mpi_ec_curve_point function
possible in many contexts. Here is a code snippet which could be used
in gpg to check a point:
static gpg_error_t
check_point (PKT_public_key *pk, gcry_mpi_t m_point)
{
gpg_error_t err;
char *curve;
gcry_ctx_t gctx = NULL;
gcry_mpi_point_t point = NULL;
/* Get the curve name from the first OpenPGP key parameter. */
curve = openpgp_oid_to_str (pk->pkey[0]);
if (!curve)
{
err = gpg_error_from_syserror ();
goto leave;
}
point = gcry_mpi_point_new (0);
if (!point)
{
err = gpg_error_from_syserror ();
goto leave;
}
err = gcry_mpi_ec_new (&gctx, NULL, curve);
if (err)
goto leave;
err = gcry_mpi_ec_decode_point (point, m_point, gctx);
if (err)
goto leave;
if (!gcry_mpi_ec_curve_point (point, gctx))
err = gpg_error (GPG_ERR_BAD_DATA);
leave:
gcry_ctx_release (gctx);
gcry_mpi_point_release (point);
xfree (curve);
return err;
}
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* cipher/ecc.c (ecc_decrypt_raw): Improve error handling.
--
Found using the Clang Static Analyzer.
Signed-off-by: Justus Winter <justus@g10code.com>
|
|
* cipher/ecc.c (ecc_encrypt_raw): Initialize 'flags' to 0.
--
Found using the Clang Static Analyzer.
Signed-off-by: Justus Winter <justus@g10code.com>
|
|
* cipher/ecc-misc.c (_gcry_ecc_mont_decodepoint): Decode point with
the prefix 0x40, additional 0x00 by MPI handling, and shorter octets
by MPI normalization.
* cipher/ecc.c (ecc_generate, ecc_encrypt_raw, ecc_decrypt_raw):
Always add the prefix 0x40.
--
Curve25519 native little-endian point representation is not friendly
to existing practice of OpenPGP code, where MPI is assumed. MPI
handling might insert 0x00 in the beginning to avoid sign confusion.
MPI handling also might remove 0x00s in the front. So, it is safe
to put the prefix 0x40.
While we support old point representation of no prefix in
ecc_mont_decodepoint, new libgcrypt always put the prefix.
|
|
* cipher/chacha20.c (selftest): Ensure 16-byte alignment for chacha20
context structure.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/salsa20.c (selftest): Ensure 16-byte alignment for salsa20
context structure.
--
Reported-by: Carlos J Puga Medina <cpm@fbsd.es>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/cipher.c (_gcry_cipher_ctl): Fix error handling.
--
Found using the Clang Static Analyzer.
Signed-off-by: Justus Winter <justus@g10code.com>
|
|
* cipher/keccak_permute_32.h (KECCAK_F1600_PERMUTE_FUNC_NAME): Track
rounds with round constant pointer instead of separate round counter.
* cipher/keccak_permute_64.h (KECCAK_F1600_PERMUTE_FUNC_NAME): Ditto.
(KECCAK_F1600_ABSORB_FUNC_NAME): Tweak lanes pointer increment for bulk
absorb loops.
--
Patch makes small tweaks to improve performance.
Benchmark on Intel Haswell @ 3.2 Ghz:
Before:
| nanosecs/byte mebibytes/sec cycles/byte
SHAKE128 | 2.27 ns/B 420.5 MiB/s 7.26 c/B
SHAKE256 | 2.79 ns/B 341.4 MiB/s 8.94 c/B
SHA3-224 | 2.64 ns/B 361.7 MiB/s 8.44 c/B
SHA3-256 | 2.79 ns/B 341.4 MiB/s 8.94 c/B
SHA3-384 | 3.65 ns/B 261.3 MiB/s 11.68 c/B
SHA3-512 | 5.27 ns/B 181.0 MiB/s 16.86 c/B
After:
| nanosecs/byte mebibytes/sec cycles/byte
SHAKE128 | 2.25 ns/B 423.5 MiB/s 7.21 c/B
SHAKE256 | 2.77 ns/B 343.9 MiB/s 8.88 c/B
SHA3-224 | 2.62 ns/B 364.1 MiB/s 8.38 c/B
SHA3-256 | 2.77 ns/B 343.8 MiB/s 8.88 c/B
SHA3-384 | 3.63 ns/B 262.6 MiB/s 11.63 c/B
SHA3-512 | 5.23 ns/B 182.3 MiB/s 16.75 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/cipher-ocb.c: Fix typos.
* cipher/des.c: Likewise.
* cipher/dsa-common.c: Likewise.
* cipher/ecc.c: Likewise.
* cipher/pubkey.c: Likewise.
* cipher/rsa-common.c: Likewise.
* cipher/scrypt.c: Likewise.
* random/random-csprng.c: Likewise.
* random/random-fips.c: Likewise.
* random/rndw32.c: Likewise.
* src/cipher-proto.h: Likewise.
* src/context.c: Likewise.
* src/fips.c: Likewise.
* src/gcrypt.h.in: Likewise.
* src/global.c: Likewise.
* src/sexp.c: Likewise.
* tests/mpitests.c: Likewise.
* tests/t-lock.c: Likewise.
Signed-off-by: Justus Winter <justus@g10code.com>
|
|
* cipher/tiger.c (tiger_round, pass, key_schedule): Convert functions
to macros.
(transform_blk): Pass variable names instead of pointers to 'pass'.
--
Benchmark results on Intel Haswell @ 3.2 Ghz:
Before:
| nanosecs/byte mebibytes/sec cycles/byte
TIGER | 3.25 ns/B 293.5 MiB/s 10.40 c/B
After (1.75x faster):
| nanosecs/byte mebibytes/sec cycles/byte
TIGER | 1.85 ns/B 515.3 MiB/s 5.92 c/B
Benchmark results on Cortex-A8 @ 1008 Mhz:
Before:
| nanosecs/byte mebibytes/sec cycles/byte
TIGER | 63.42 ns/B 15.04 MiB/s 63.93 c/B
After (1.26x faster):
| nanosecs/byte mebibytes/sec cycles/byte
TIGER | 49.99 ns/B 19.08 MiB/s 50.39 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/Makefile.am: Add 'keccak-armv7-neon.S'.
* cipher/keccak-armv7-neon.S: New.
* cipher/keccak.c (USE_64BIT_ARM_NEON): New.
(NEED_COMMON64): Select if USE_64BIT_ARM_NEON.
[NEED_COMMON64] (round_consts_64bit): Rename to...
[NEED_COMMON64] (_gcry_keccak_round_consts_64bit): ...this; Add
terminator at end.
[USE_64BIT_ARM_NEON] (_gcry_keccak_permute_armv7_neon)
(_gcry_keccak_absorb_lanes64_armv7_neon, keccak_permute64_armv7_neon)
(keccak_absorb_lanes64_armv7_neon, keccak_armv7_neon_64_ops): New.
(keccak_init) [USE_64BIT_ARM_NEON]: Select ARM/NEON implementation
if supported by HW.
* cipher/keccak_permute_64.h (KECCAK_F1600_PERMUTE_FUNC_NAME): Update
to use new round constant table.
* configure.ac: Add 'keccak-armv7-neon.lo'.
--
Patch adds ARMv7/NEON implementation of Keccak (SHAKE/SHA3). Patch
is based on public-domain implementation by Ronny Van Keer from
SUPERCOP package:
https://github.com/floodyberry/supercop/blob/master/crypto_hash/\
keccakc1024/inplace-armv7a-neon/keccak2.s
Benchmark results on Cortex-A8 @ 1008 Mhz:
Before (generic 32-bit bit-interleaved impl.):
| nanosecs/byte mebibytes/sec cycles/byte
SHAKE128 | 83.00 ns/B 11.49 MiB/s 83.67 c/B
SHAKE256 | 101.7 ns/B 9.38 MiB/s 102.5 c/B
SHA3-224 | 96.13 ns/B 9.92 MiB/s 96.90 c/B
SHA3-256 | 101.5 ns/B 9.40 MiB/s 102.3 c/B
SHA3-384 | 131.4 ns/B 7.26 MiB/s 132.5 c/B
SHA3-512 | 189.1 ns/B 5.04 MiB/s 190.6 c/B
After (ARM/NEON, ~3.2x faster):
| nanosecs/byte mebibytes/sec cycles/byte
SHAKE128 | 25.09 ns/B 38.01 MiB/s 25.29 c/B
SHAKE256 | 30.95 ns/B 30.82 MiB/s 31.19 c/B
SHA3-224 | 29.24 ns/B 32.61 MiB/s 29.48 c/B
SHA3-256 | 30.95 ns/B 30.82 MiB/s 31.19 c/B
SHA3-384 | 40.42 ns/B 23.59 MiB/s 40.74 c/B
SHA3-512 | 58.37 ns/B 16.34 MiB/s 58.84 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/keccak.c [USE_64BIT] [__x86_64__] (absorb_lanes64_8)
(absorb_lanes64_4, absorb_lanes64_2, absorb_lanes64_1): New.
* cipher/keccak.c [USE_64BIT] [!__x86_64__] (absorb_lanes64_8)
(absorb_lanes64_4, absorb_lanes64_2, absorb_lanes64_1): New.
[USE_64BIT] (KECCAK_F1600_ABSORB_FUNC_NAME): New.
[USE_64BIT] (keccak_absorb_lanes64): Remove.
[USE_64BIT_SHLD] (KECCAK_F1600_ABSORB_FUNC_NAME): New.
[USE_64BIT_SHLD] (keccak_absorb_lanes64_shld): Remove.
[USE_64BIT_BMI2] (KECCAK_F1600_ABSORB_FUNC_NAME): New.
[USE_64BIT_BMI2] (keccak_absorb_lanes64_bmi2): Remove.
* cipher/keccak_permute_64.h (KECCAK_F1600_ABSORB_FUNC_NAME): New.
--
Optimize 64-bit absorb functions for small speed-up. After this
change, 64-bit BMI2 implementation matches speed of fastest results
from SUPERCOP for Intel Haswell CPUs (long messages).
Benchmark on Intel Haswell @ 3.2 Ghz:
Before:
| nanosecs/byte mebibytes/sec cycles/byte
SHAKE128 | 2.32 ns/B 411.7 MiB/s 7.41 c/B
SHAKE256 | 2.84 ns/B 336.2 MiB/s 9.08 c/B
SHA3-224 | 2.69 ns/B 354.9 MiB/s 8.60 c/B
SHA3-256 | 2.84 ns/B 336.0 MiB/s 9.08 c/B
SHA3-384 | 3.69 ns/B 258.4 MiB/s 11.81 c/B
SHA3-512 | 5.30 ns/B 179.9 MiB/s 16.97 c/B
After:
| nanosecs/byte mebibytes/sec cycles/byte
SHAKE128 | 2.27 ns/B 420.6 MiB/s 7.26 c/B
SHAKE256 | 2.79 ns/B 341.4 MiB/s 8.94 c/B
SHA3-224 | 2.64 ns/B 361.7 MiB/s 8.44 c/B
SHA3-256 | 2.79 ns/B 341.5 MiB/s 8.94 c/B
SHA3-384 | 3.65 ns/B 261.4 MiB/s 11.68 c/B
SHA3-512 | 5.27 ns/B 181.0 MiB/s 16.87 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* src/hash-common.c (_gcry_hash_selftest_check_one): Add handling for
XOFs.
* src/keccak.c (keccak_ops_t): Rename 'extract_inplace' to 'extract'
and add 'pos' argument.
(KECCAK_CONTEXT): Add 'suffix'.
(keccak_extract_inplace64): Rename to...
(keccak_extract64): ...this; Add handling for 'pos' argument.
(keccak_extract_inplace32bi): Rename to...
(keccak_extract32bi): ...this; Add handling for 'pos' argument.
(keccak_extract_inplace64): Rename to...
(keccak_extract64): ...this; Add handling for 'pos' argument.
(keccak_extract_inplace32bi_bmi2): Rename to...
(keccak_extract32bi_bmi2): ...this; Add handling for 'pos' argument.
(keccak_init): Setup 'suffix'; add SHAKE128 & SHAKE256.
(shake128_init, shake256_init): New.
(keccak_final): Do not initial permute for SHAKE output; use correct
suffix for SHAKE.
(keccak_extract): New.
(keccak_selftests_keccak): Add SHAKE128 & SHAKE256 test-vectors.
(run_selftests): Add SHAKE128 & SHAKE256.
(shake128_asn, oid_spec_shake128, shake256_asn, oid_spec_shake256)
(_gcry_digest_spec_shake128, _gcry_digest_spec_shake256): New.
* cipher/md.c (digest_list): Add SHAKE128 & SHAKE256.
* doc/gcrypt.texi: Ditto.
* src/cipher.h (_gcry_digest_spec_shake128)
(_gcry_digest_spec_shake256): New.
* src/gcrypt.h.in (GCRY_MD_SHAKE128, GCRY_MD_SHAKE256): New.
* tests/basic.c (check_one_md): Add XOF check; Add 'elen' argument.
(check_one_md_multi): Skip if algo is XOF.
(check_digests): Add SHAKE128 & SHAKE256 test vectors.
* tests/bench-slope.c (kdf_bench_one): Skip XOFs.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/crc.c (_gcry_digest_spec_crc32)
(_gcry_digest_spec_crc32_rfc1510, _gcry_digest_spec_crc24_rfc2440): Set
'extract' NULL.
* cipher/gostr3411-94.c (_gcry_digest_spec_gost3411_94)
(_gcry_digest_spec_gost3411_cp): Ditto.
* cipher/keccak.c (_gcry_digest_spec_sha3_224)
(_gcry_digest_spec_sha3_256, _gcry_digest_spec_sha3_384)
(_gcry_digest_spec_sha3_512): Ditto.
* cipher/md2.c (_gcry_digest_spec_md2): Ditto.
* cipher/md4.c (_gcry_digest_spec_md4): Ditto.
* cipher/md5.c (_gcry_digest_spec_md5): Ditto.
* cipher/rmd160.c (_gcry_digest_spec_rmd160): Ditto.
* cipher/sha1.c (_gcry_digest_spec_sha1): Ditto.
* cipher/sha256.c (_gcry_digest_spec_sha224)
(_gcry_digest_spec_sha256): Ditto.
* cipher/sha512.c (_gcry_digest_spec_sha384)
(_gcry_digest_spec_sha512): Ditto.
* cipher/stribog.c (_gcry_digest_spec_stribog_256)
(_gcry_digest_spec_stribog_512): Ditto.
* cipher/tiger.c (_gcry_digest_spec_tiger)
(_gcry_digest_spec_tiger1, _gcry_digest_spec_tiger2): Ditto.
* cipher/whirlpool.c (_gcry_digest_spec_whirlpool): Ditto.
* cipher/md.c (md_enable): Do not allow combination of HMAC and
'expandable-output function'.
(md_final): Check if spec->read is NULL before calling.
(md_read): Ditto.
(md_extract, _gcry_md_extract): New.
* doc/gcrypt.texi: Add SHA3 algorithms and gcry_md_extract.
* src/cipher-proto.h (gcry_md_extract_t): New.
(gcry_md_spec_t): Add 'extract'.
* src/gcrypt-int.g (_gcry_md_extract): New.
* src/gcrypt.h.in (gcry_md_extract): New.
* src/libgcrypt.def: Add gcry_md_extract.
* src/libgcrypt.vers: Add gcry_md_extract.
* src/visibility.c (gcry_md_extract): New.
* src/visibility.h (gcry_md_extract): New.
--
Patch adds new interface for reading output from 'expandable-output
function' MD algorithms that can give variable length output (ie.
SHAKE algorithms from FIPS-202). New function to read output is
gpg_error_t gcry_md_extract(gcry_md_hd_t md, int algo,
void *buffer, size_t length);
Function implicitly finalizes algorithm so that no new input can
be given. Subsequents calls of the function return more output
bytes from the algorithm.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/md.c (prepare_macpads): Check hmac flag.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/Makefile.am: Add 'keccak_permute_32.h' and
'keccak_permute_64.h'.
* cipher/hash-common.h [USE_SHA3] (MD_BLOCK_MAX_BLOCKSIZE): Remove.
* cipher/keccak.c (USE_64BIT, USE_32BIT, USE_64BIT_BMI2)
(USE_64BIT_SHLD, USE_32BIT_BMI2, NEED_COMMON64, NEED_COMMON32BI)
(keccak_ops_t): New.
(KECCAK_STATE): Add 'state64' and 'state32bi' members.
(KECCAK_CONTEXT): Remove 'bctx'; add 'blocksize', 'count' and 'ops'.
(rol64, keccak_f1600_state_permute): Remove.
[NEED_COMMON64] (round_consts_64bit, keccak_extract_inplace64): New.
[NEED_COMMON32BI] (round_consts_32bit, keccak_extract_inplace32bi)
(keccak_absorb_lane32bi): New.
[USE_64BIT] (ANDN64, ROL64, keccak_f1600_state_permute64)
(keccak_absorb_lanes64, keccak_generic64_ops): New.
[USE_64BIT_SHLD] (ANDN64, ROL64, keccak_f1600_state_permute64_shld)
(keccak_absorb_lanes64_shld, keccak_shld_64_ops): New.
[USE_64BIT_BMI2] (ANDN64, ROL64, keccak_f1600_state_permute64_bmi2)
(keccak_absorb_lanes64_bmi2, keccak_bmi2_64_ops): New.
[USE_32BIT] (ANDN64, ROL64, keccak_f1600_state_permute32bi)
(keccak_absorb_lanes32bi, keccak_generic32bi_ops): New.
[USE_32BIT_BMI2] (ANDN64, ROL64, keccak_f1600_state_permute32bi_bmi2)
(pext, pdep, keccak_absorb_lane32bi_bmi2, keccak_absorb_lanes32bi_bmi2)
(keccak_extract_inplace32bi_bmi2, keccak_bmi2_32bi_ops): New.
(keccak_write): New.
(keccak_init): Adjust to KECCAK_CONTEXT changes; add implementation
selection based on HWF features.
(keccak_final): Adjust to KECCAK_CONTEXT changes; use selected 'ops'
for state manipulation.
(keccak_read): Adjust to KECCAK_CONTEXT changes.
(_gcry_digest_spec_sha3_224, _gcry_digest_spec_sha3_256)
(_gcry_digest_spec_sha3_348, _gcry_digest_spec_sha3_512): Use
'keccak_write' instead of '_gcry_md_block_write'.
* cipher/keccak_permute_32.h: New.
* cipher/keccak_permute_64.h: New.
--
Patch adds new generic 64-bit and 32-bit implementations and
optimized implementations for SHA3:
- Generic 64-bit implementation based on 'simple' implementation
from SUPERCOP package.
- Generic 32-bit bit-inteleaved implementataion based on
'simple32bi' implementation from SUPERCOP package.
- Intel BMI2 optimized variants of 64-bit and 32-bit BI
implementations.
- Intel SHLD optimized variant of 64-bit implementation.
Patch also makes proper use of sponge construction to avoid
use of addition input buffer.
Below are bench-slope benchmarks for new 64-bit implementations
made on Intel Core i5-4570 (no turbo, 3.2 Ghz, gcc-4.9.2).
Before (amd64):
SHA3-224 | 3.92 ns/B 243.2 MiB/s 12.55 c/B
SHA3-256 | 4.15 ns/B 230.0 MiB/s 13.27 c/B
SHA3-384 | 5.40 ns/B 176.6 MiB/s 17.29 c/B
SHA3-512 | 7.77 ns/B 122.7 MiB/s 24.87 c/B
After (generic 64-bit, amd64), 1.10x faster):
SHA3-224 | 3.57 ns/B 267.4 MiB/s 11.42 c/B
SHA3-256 | 3.77 ns/B 252.8 MiB/s 12.07 c/B
SHA3-384 | 4.91 ns/B 194.1 MiB/s 15.72 c/B
SHA3-512 | 7.06 ns/B 135.0 MiB/s 22.61 c/B
After (Intel SHLD 64-bit, amd64, 1.13x faster):
SHA3-224 | 3.48 ns/B 273.7 MiB/s 11.15 c/B
SHA3-256 | 3.68 ns/B 258.9 MiB/s 11.79 c/B
SHA3-384 | 4.80 ns/B 198.7 MiB/s 15.36 c/B
SHA3-512 | 6.89 ns/B 138.4 MiB/s 22.05 c/B
After (Intel BMI2 64-bit, amd64, 1.45x faster):
SHA3-224 | 2.71 ns/B 352.1 MiB/s 8.67 c/B
SHA3-256 | 2.86 ns/B 333.2 MiB/s 9.16 c/B
SHA3-384 | 3.72 ns/B 256.2 MiB/s 11.91 c/B
SHA3-512 | 5.34 ns/B 178.5 MiB/s 17.10 c/B
Benchmarks of new 32-bit implementations on Intel Core i5-4570
(no turbo, 3.2 Ghz, gcc-4.9.2):
Before (win32):
SHA3-224 | 12.05 ns/B 79.16 MiB/s 38.56 c/B
SHA3-256 | 12.75 ns/B 74.78 MiB/s 40.82 c/B
SHA3-384 | 16.63 ns/B 57.36 MiB/s 53.22 c/B
SHA3-512 | 23.97 ns/B 39.79 MiB/s 76.72 c/B
After (generic 32-bit BI, win32, 1.23x to 1.29x faster):
SHA3-224 | 9.76 ns/B 97.69 MiB/s 31.25 c/B
SHA3-256 | 10.27 ns/B 92.82 MiB/s 32.89 c/B
SHA3-384 | 13.22 ns/B 72.16 MiB/s 42.31 c/B
SHA3-512 | 18.65 ns/B 51.13 MiB/s 59.70 c/B
After (Intel BMI2 32-bit BI, win32, 1.66x to 1.70x faster):
SHA3-224 | 7.26 ns/B 131.4 MiB/s 23.23 c/B
SHA3-256 | 7.65 ns/B 124.7 MiB/s 24.47 c/B
SHA3-384 | 9.87 ns/B 96.67 MiB/s 31.58 c/B
SHA3-512 | 14.05 ns/B 67.85 MiB/s 44.99 c/B
Benchmarks of new 32-bit implementation on ARM Cortex-A8
(1008 Mhz, gcc-4.9.1):
Before:
SHA3-224 | 148.6 ns/B 6.42 MiB/s 149.8 c/B
SHA3-256 | 157.2 ns/B 6.07 MiB/s 158.4 c/B
SHA3-384 | 205.3 ns/B 4.65 MiB/s 206.9 c/B
SHA3-512 | 296.3 ns/B 3.22 MiB/s 298.6 c/B
After (1.56x faster):
SHA3-224 | 96.12 ns/B 9.92 MiB/s 96.89 c/B
SHA3-256 | 101.5 ns/B 9.40 MiB/s 102.3 c/B
SHA3-384 | 131.4 ns/B 7.26 MiB/s 132.5 c/B
SHA3-512 | 188.2 ns/B 5.07 MiB/s 189.7 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/sha1.c (sha1_init): Use HWF_INTEL_FAST_SHLD instead of
HWF_INTEL_CPU.
* cipher/sha256.c (sha256_init, sha224_init): Ditto.
* cipher/sha512.c (sha512_init, sha384_init): Ditto.
* src/g10lib.h (HWF_INTEL_FAST_SHLD): New.
(HWF_INTEL_BMI2, HWF_INTEL_SSSE3, HWF_INTEL_PCLMUL, HWF_INTEL_AESNI)
(HWF_INTEL_RDRAND, HWF_INTEL_AVX, HWF_INTEL_AVX2)
(HWF_ARM_NEON): Update.
* src/hwf-x86.c (detect_x86_gnuc): Add detection of Intel Core
CPUs with fast SHLD/SHRD instruction.
* src/hwfeatures.c (hwflist): Add "intel-fast-shld".
--
Intel Core CPUs since codename sandy-bridge have been able to
execute SHLD/SHRD instructions faster than rotate instructions
ROL/ROR. Since SHLD/SHRD can be used to do rotation, some
optimized implementations (SHA1/SHA256/SHA512) use SHLD/SHRD
instructions in-place of ROL/ROR.
This patch provides more accurate detection of CPUs with
fast SHLD implementation.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/camellia-glue.c (_gcry_camellia_aesni_avx_ocb_enc)
(_gcry_camellia_aesni_avx_ocb_dec, _gcry_camellia_aesni_avx_ocb_auth)
(_gcry_camellia_aesni_avx2_ocb_enc, _gcry_camellia_aesni_avx2_ocb_dec)
(_gcry_camellia_aesni_avx2_ocb_auth, _gcry_camellia_ocb_crypt)
(_gcry_camellia_ocb_auth): Change 'Ls' from pointer array to u64 array.
* cipher/serpent.c (_gcry_serpent_sse2_ocb_enc)
(_gcry_serpent_sse2_ocb_dec, _gcry_serpent_sse2_ocb_auth)
(_gcry_serpent_avx2_ocb_enc, _gcry_serpent_avx2_ocb_dec)
(_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): Ditto.
* cipher/twofish.c (_gcry_twofish_amd64_ocb_enc)
(_gcry_twofish_amd64_ocb_dec, _gcry_twofish_amd64_ocb_auth)
(twofish_amd64_ocb_enc, twofish_amd64_ocb_dec, twofish_amd64_ocb_auth)
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Ditto.
--
Pointers on x32 are 32-bit, but amd64 assembly implementations
expect 64-bit pointers. Pass 'Ls' array to 64-bit integers so
that input arrays has correct format for assembly functions.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/md.c (struct gcry_md_context): Add flags.hmac.
Remove macpads and mcpads_Bsize.
(md_open): Initialize flags.hmac. Remove macpads initialization.
(md_enable): Allocate contexts when flags.hmac is enabled.
(md_copy): Remove macpads copying. Add copying contexts.
(_gcry_md_reset): When flags.hmac is enabled, restore precomputed
context with input pad
(md_close): Remove macpads wiping.
(md_final): When flags.hmac is enabled, compute hmac by precomputed
context with output pad.
(prepare_macpads): Prepare precomputed contexts with input pad and
output pad for each registered digest entry.
(_gcry_md_setkey): Just call prepare_macpads.
--
This change is making things straight in HMAC computation. This makes
HMAC computation allow multple algorithms in future.
Libgcrypt's code has a potential to compute digests for multiple
algorithms at once (currently, it's not enabled). HMAC code didn't
work well with multple algorithms, because the macpads were only
allocated for an algorithm. Now, it's allocated for each algorithm.
We now precompute hash contexts, instead of keeping input pad and
output pad. This can be performance improvement, which is described
in RFC 2104.
Thanks to:
Andrea Visconti, Simone Bossi, Hany Ragab and Alexandro Calò
For the discussion and their paper of CANS2015, which titled:
On the weaknesses of PBKDF2
|
|
* src/gcrypt-int.h (_gcry_sexp_extract_param): Revert the change.
* cipher/dsa.c (dsa_check_secret_key): Ditto.
* src/sexp.c (_gcry_sexp_extract_param): Return gpg_err_code_t.
* src/gcrypt-int.h (_gcry_err_make_from_errno)
(_gcry_error_from_errno): Return gpg_error_t.
* cipher/cipher.c (_gcry_cipher_open_internal)
(_gcry_cipher_ctl, _gcry_cipher_ctl): Don't use gcry_error.
* src/global.c (_gcry_vcontrol): Likewise.
* cipher/ecc-eddsa.c (_gcry_ecc_eddsa_genkey): Use
gpg_err_code_from_syserror.
* cipher/mac.c (mac_reset, mac_setkey, mac_setiv, mac_write)
(mac_read, mac_verify): Return gcry_err_code_t.
* cipher/rsa-common.c (mgf1): Use gcry_err_code_t for ERR.
* src/visibility.c (gcry_error_from_errno): Return gpg_error_t.
--
Reverting a part of 73374fdd and fix _gcry_sexp_extract_param
return type, instead.
Fix similar coding mistakes, throughout.
|
|
* cipher/rijndael-aesni.c (do_aesni_ctr_4): Split assembly block in
two parts to reduce number of register constraints needed.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* src/gcrypt-int.h (_gcry_sexp_extract_param): Return gpg_error_t.
* cipher/dsa.c (dsa_generate): Fix call to _gcry_sexp_extract_param.
* src/g10lib.h (_gcry_vcontrol): Return gcry_err_code_t.
* src/visibility.c (gcry_mpi_snatch): Fix call to _gcry_mpi_snatch.
--
GnuPG-bug-id: 2074
|
|
* cipher/cipher-selftest.c (_gcry_selftest_helper_cbc)
(_gcry_selftest_helper_cfb, _gcry_selftest_helper_ctr): Mark variable
as unused.
* random/rndw32.c (slow_gatherer): Avoid signed pointer mismatch
warning.
* src/secmem.c (init_pool): Avoid unused variable warning.
* tests/random.c (writen, readn): Include on if needed.
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* cipher/cipher-selftest.c (_gcry_cipher_selftest_alloc_ctx): New.
* cipher/rijndael.c (selftest_basic_128, selftest_basic_192)
(selftest_basic_256): Allocate context on the heap.
--
The stack alignment on Windows changed and because ld seems to limit
stack variables to a 8 byte alignment (we request 16), we get bus
errors from the selftests if AESNI is in use.
GnuPG-bug-id: 2085
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* cipher/rsa.c (rsa_sign): Check the CRT.
--
Failures in the computation of the CRT (e.g. due faulty hardware) can
lead to a leak of the private key. The standard precaution against
this is to verify the signature after signing. GnuPG does this itself
and even has an option to disable this. However, the low performance
impact of this extra precaution suggest that it should always be done
and Libgcrypt is the right place here. For decryption is not done
because the application will detect the failure due to garbled
plaintext and in any case no key derived material will be send to the
user.
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* cipher/keccak.c (keccak_f1600_state_permute): Fix indexes for D[5].
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/camellia-glue.c (_gcry_camellia_ocb_crypt)
(_gcry_camellia_ocb_auth): Precalculate Ls array always, instead of
just if 'blkn % <parallel blocks> == 0'.
* cipher/serpent.c (_gcry_serpent_ocb_crypt)
(_gcry_serpent_ocb_auth): Ditto.
* cipher/rijndael-aesni.c (get_l): Remove low-bit checks.
(aes_ocb_enc, aes_ocb_dec, _gcry_aes_aesni_ocb_auth): Handle leading
blocks until block counter is multiple of 4, so that parallel block
processing loop can use 'c->u_mode.ocb.L' array directly.
* tests/basic.c (check_ocb_cipher_largebuf): Rename to...
(check_ocb_cipher_largebuf_split): ...this and add option to process
large buffer as two split buffers.
(check_ocb_cipher_largebuf): New.
--
Patch simplifies source and reduce object size.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/rijndael-aesni.c (do_aesni_ctr_4): Do addition using
CTR in big-endian form, if least-significant byte does not overflow.
--
Patch improves AES-NI CTR speed by 20%.
Benchmark on Intel Haswell (3.2 Ghz):
Before:
AES | nanosecs/byte mebibytes/sec cycles/byte
CTR enc | 0.273 ns/B 3489.8 MiB/s 0.875 c/B
CTR dec | 0.273 ns/B 3491.0 MiB/s 0.874 c/B
After:
CTR enc | 0.228 ns/B 4190.0 MiB/s 0.729 c/B
CTR dec | 0.228 ns/B 4190.2 MiB/s 0.729 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/hash-common.h (MD_BLOCK_MAX_BLOCKSIZE): Increase blocksize
USE_SHA3 enabled.
* cipher/keccak.c (SHA3_DELIMITED_SUFFIX, SHAKE_DELIMITED_SUFFIX): New.
(KECCAK_STATE): Add proper state.
(KECCAK_CONTEXT): Add 'outlen'.
(rol64, keccak_f1600_state_permute, transform_blk, transform): New.
(keccak_init): Add proper initialization.
(keccak_final): Add proper finalization.
(selftests_keccak): Add selftests.
(oid_spec_sha3_224, oid_spec_sha3_256, oid_spec_sha3_384)
(oid_spec_sha3_512): Add OID.
(_gcry_digest_spec_sha3_224, _gcry_digest_spec_sha3_256)
(_gcry_digest_spec_sha3_384, _gcry_digest_spec_sha3_512): Fix output
length.
* cipher/mac-hmac.c (map_mac_algo_to_md): Fix mapping for SHA3-512.
(hmac_get_keylen): Return proper blocksizes for SHA3 algorithms.
[USE_SHA3] (_gcry_mac_type_spec_hmac_sha3_224)
(_gcry_mac_type_spec_hmac_sha3_256, _gcry_mac_type_spec_hmac_sha3_384)
(_gcry_mac_type_spec_hmac_sha3_512): New.
* cipher/mac-internal [USE_SHA3] (_gcry_mac_type_spec_hmac_sha3_224)
(_gcry_mac_type_spec_hmac_sha3_256, _gcry_mac_type_spec_hmac_sha3_384)
(_gcry_mac_type_spec_hmac_sha3_512): New.
* cipher/mac.c (mac_list) [USE_SHA3]: Add SHA3 algorithms.
* cipher/md.c (md_open): Use proper SHA-3 blocksizes for HMAC macpads.
* tests/basic.c (check_digests): Add SHA3 test vectors.
--
Patch adds generic implementation for SHA3. Currently missing with this
patch:
- HMAC SHA3 test vectors, not available from NIST (yet?)
- ASNs
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/cipher-internal.h (ocb_get_l): New.
* cipher/cipher-ocb.c (_gcry_cipher_ocb_authenticate)
(ocb_crypt): Use 'ocb_get_l' instead of '_gcry_cipher_ocb_get_l'.
* cipher/camellia-glue.c (get_l): Remove.
(_gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth): Precalculate
offset array when block count matches parallel operation size; Use
'ocb_get_l' instead of 'get_l'.
* cipher/rijndael-aesni.c (get_l): Add fast path for 75% most common
offsets.
(aesni_ocb_enc, aesni_ocb_dec, _gcry_aes_aesni_ocb_auth): Precalculate
offset array when block count matches parallel operation size.
* cipher/rijndael-ssse3-amd64.c (get_l): Add fast path for 75% most
common offsets.
* cipher/rijndael.c (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth): Use
'ocb_get_l' instead of '_gcry_cipher_ocb_get_l'.
* cipher/serpent.c (get_l): Remove.
(_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): Precalculate
offset array when block count matches parallel operation size; Use
'ocb_get_l' instead of 'get_l'.
* cipher/twofish.c (get_l): Remove.
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Use 'ocb_get_l'
instead of 'get_l'.
--
Patch optimizes OCB offset calculation for generic code and
assembly implementations with parallel block processing.
Benchmark of OCB AES-NI on Intel Haswell:
$ tests/bench-slope --cpu-mhz 3201 cipher aes
Before:
AES | nanosecs/byte mebibytes/sec cycles/byte
CTR enc | 0.274 ns/B 3483.9 MiB/s 0.876 c/B
CTR dec | 0.273 ns/B 3490.0 MiB/s 0.875 c/B
OCB enc | 0.289 ns/B 3296.1 MiB/s 0.926 c/B
OCB dec | 0.299 ns/B 3189.9 MiB/s 0.957 c/B
OCB auth | 0.260 ns/B 3670.0 MiB/s 0.832 c/B
After:
AES | nanosecs/byte mebibytes/sec cycles/byte
CTR enc | 0.273 ns/B 3489.4 MiB/s 0.875 c/B
CTR dec | 0.273 ns/B 3487.5 MiB/s 0.875 c/B
OCB enc | 0.248 ns/B 3852.8 MiB/s 0.792 c/B
OCB dec | 0.261 ns/B 3659.5 MiB/s 0.834 c/B
OCB auth | 0.227 ns/B 4205.5 MiB/s 0.726 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/ecc.c (check_secret_key): Y1 should not be NULL when check.
(ecc_check_secret_key): Support Montgomery curve.
* mpi/ec.c (_gcry_mpi_ec_curve_point): Fix condition.
|
|
* src/gcrypt.h.in (GCRY_MD_SHA3_224, GCRY_MD_SHA3_256)
(GCRY_MD_SHA3_384, GCRY_MD_SHA3_512): New.
(GCRY_MAC_HMAC_SHA3_224, GCRY_MAC_HMAC_SHA3_256)
(GCRY_MAC_HMAC_SHA3_384, GCRY_MAC_HMAC_SHA3_512): New.
* cipher/keccak.c: New with stub functions.
* cipher/Makefile.am (EXTRA_libcipher_la_SOURCES): Add keccak.c.
* configure.ac (available_digests): Add sha3.
(USE_SHA3): New.
* src/fips.c (run_hmac_selftests): Add SHA3 to the required selftests.
* cipher/md.c (digest_list) [USE_SHA3]: Add standard SHA3 algos.
(md_open): Ditto for hmac processing.
* cipher/mac-hmac.c (map_mac_algo_to_md): Add mapping.
* cipher/hmac-tests.c (run_selftests): Prepare for tests.
* cipher/pubkey-util.c (get_hash_algo): Add "sha3-xxx".
--
Note that the algo GCRY_MD_SHA3_xxx are prelimanry. We should try to
sync them with OpenPGP.
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* cipher/ecc-eddsa.c (_gcry_ecc_eddsa_sign): Init DISGEST and goto
leave on error.
--
Fixing an issue found by static analysis.
Signed-off-by: Ismo Puustinen <ismo.puustinen@intel.com>
Added DIGEST init and wrote Changelog.
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
* cipher/ecc-curves.c (curve_aliases, domain_parms): Add Curve25519.
* tests/curves.c (N_CURVES): It's 22 now.
* src/cipher.h (PUBKEY_FLAG_DJB_TWEAK): New.
* cipher/ecc-common.h (_gcry_ecc_mont_decodepoint): New.
* cipher/ecc-misc.c (_gcry_ecc_mont_decodepoint): New.
* cipher/ecc.c (nist_generate_key): Handle the case of
PUBKEY_FLAG_DJB_TWEAK and Montgomery curve.
(test_ecdh_only_keys, check_secret_key): Likewise.
(ecc_generate): Support Curve25519 which is Montgomery curve with flag
PUBKEY_FLAG_DJB_TWEAK and PUBKEY_FLAG_COMP.
(ecc_encrypt_raw): Get flags from KEYPARMS and handle
PUBKEY_FLAG_DJB_TWEAK and Montgomery curve.
(ecc_decrypt_raw): Likewise.
(compute_keygrip): Handle the case of PUBKEY_FLAG_DJB_TWEAK.
* cipher/pubkey-util.c (_gcry_pk_util_parse_flaglist):
PUBKEY_FLAG_EDDSA implies PUBKEY_FLAG_DJB_TWEAK.
Parse "djb-tweak" for PUBKEY_FLAG_DJB_TWEAK.
--
With PUBKEY_FLAG_DJB_TWEAK, secret key has msb set and it should be
always multiple by cofactor.
|
|
* cipher/twofish.c (poly_to_exp): Increase size by one, change type
from byte to u16 and insert '492' to index 0.
(exp_to_poly): Increase size by 256, let new cells have zero value.
(CALC_S): Execute unconditionally with help of modified tables.
(do_twofish_setkey): Change type for 'tmp' to 'unsigned int'; Un-unroll
CALC_K256 and CALC_K phases to reduce generated object size.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/cipher-ocb.c (_gcry_cipher_ocb_authenticate)
(ocb_crypt): Change bulk function to return number of unprocessed
blocks.
* src/cipher.h (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth)
(_gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth)
(_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth)
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Change return type
to 'size_t'.
* cipher/camellia-glue.c (get_l): Only if USE_AESNI_AVX or
USE_AESNI_AVX2 defined.
(_gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth): Change return type
to 'size_t' and return remaining blocks; Remove unaccelerated common
code path. Enable remaining common code only if USE_AESNI_AVX or
USE_AESNI_AVX2 defined; Remove unaccelerated common code.
* cipher/rijndael.c (_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth): Change
return type to 'size_t' and return zero.
* cipher/serpent.c (get_l): Only if USE_SSE2, USE_AVX2 or USE_NEON
defined.
(_gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): Change return type
to 'size_t' and return remaining blocks; Remove unaccelerated common
code path. Enable remaining common code only if USE_SSE2, USE_AVX2 or
USE_NEON defined; Remove unaccelerated common code.
* cipher/twofish.c (get_l): Only if USE_AMD64_ASM defined.
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Change return type
to 'size_t' and return remaining blocks; Remove unaccelerated common
code path. Enable remaining common code only if USE_AMD64_ASM defined;
Remove unaccelerated common code.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/cipher.c (_gcry_cipher_open_internal): Setup OCB bulk
functions for Serpent.
* cipher/serpent-armv7-neon.S: Add OCB assembly functions.
* cipher/serpent-avx2-amd64.S: Add OCB assembly functions.
* cipher/serpent-sse2-amd64.S: Add OCB assembly functions.
* cipher/serpent.c (_gcry_serpent_sse2_ocb_enc)
(_gcry_serpent_sse2_ocb_dec, _gcry_serpent_sse2_ocb_auth)
(_gcry_serpent_neon_ocb_enc, _gcry_serpent_neon_ocb_dec)
(_gcry_serpent_neon_ocb_auth, _gcry_serpent_avx2_ocb_enc)
(_gcry_serpent_avx2_ocb_dec, _gcry_serpent_avx2_ocb_auth): New
prototypes.
(get_l, _gcry_serpent_ocb_crypt, _gcry_serpent_ocb_auth): New.
* src/cipher.h (_gcry_serpent_ocb_crypt)
(_gcry_serpent_ocb_auth): New.
* tests/basic.c (check_ocb_cipher): Add test-vector for serpent.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/cipher.c (_gcry_cipher_open_internal): Setup OCB bulk
functions for Twofish.
* cipher/twofish-amd64.S: Add OCB assembly functions.
* cipher/twofish.c (_gcry_twofish_amd64_ocb_enc)
(_gcry_twofish_amd64_ocb_dec, _gcry_twofish_amd64_ocb_auth): New
prototypes.
(call_sysv_fn5, call_sysv_fn6, twofish_amd64_ocb_enc)
(twofish_amd64_ocb_dec, twofish_amd64_ocb_auth, get_l)
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): New.
* src/cipher.h (_gcry_twofish_ocb_crypt)
(_gcry_twofish_ocb_auth): New.
* tests/basic.c (check_ocb_cipher): Add test-vector for Twofish.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/camellia-aesni-avx-amd64.S: Add OCB assembly functions.
* cipher/camellia-aesni-avx2-amd64.S: Add OCB assembly functions.
* cipher/camellia-glue.c (_gcry_camellia_aesni_avx_ocb_enc)
(_gcry_camellia_aesni_avx_ocb_dec, _gcry_camellia_aesni_avx_ocb_auth)
(_gcry_camellia_aesni_avx2_ocb_enc, _gcry_camellia_aesni_avx2_ocb_dec)
(_gcry_camellia_aesni_avx2_ocb_auth): New prototypes.
(get_l, _gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth): New.
* cipher/cipher.c (_gcry_cipher_open_internal): Setup OCB bulk
functions for Camellia.
* src/cipher.h (_gcry_camellia_ocb_crypt)
(_gcry_camellia_ocb_auth): New.
* tests/basic.c (check_ocb_cipher): Add test-vector for Camellia.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/rijndael-ssse3-amd64.c (SSSE3_STATE_SIZE): New.
[HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS] (vpaes_ssse3_prepare): Use
'ssse3_state' for storing current SSSE3 state.
[HAVE_COMPATIBLE_GCC_WIN64_PLATFORM_AS]
(vpaes_ssse3_cleanup): Restore SSSE3 state from 'ssse3_state'.
(_gcry_aes_ssse3_do_setkey, _gcry_aes_ssse3_prepare_decryption)
(_gcry_aes_ssse3_encrypt, _gcry_aes_ssse3_cfb_enc)
(_gcry_aes_ssse3_cbc_enc, _gcry_aes_ssse3_ctr_enc)
(_gcry_aes_ssse3_decrypt, _gcry_aes_ssse3_cfb_dec)
(_gcry_aes_ssse3_cbc_dec, _gcry_aes_ssse3_cbc_dec): Add 'ssse3_state'
array.
(get_l, ssse3_ocb_enc, ssse3_ocb_dec, _gcry_aes_ssse3_ocb_crypt)
(_gcry_aes_ssse3_ocb_auth): New.
* cipher/rijndael.c (_gcry_aes_ssse3_ocb_crypt)
(_gcry_aes_ssse3_ocb_auth): New.
(_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth) [USE_SSSE3]: Use SSSE3
implementation for OCB.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* cipher/cipher-gcm.c: Do not copy zero bytes from an empty buffer. Let
the function continue to add padding as needed though.
* cipher/mac-poly1305.c: If the caller requested to finish the hash
function without a copy of the result, return immediately.
--
Caught by UndefinedBehaviorSanitizer.
Signed-off-by: Peter Wu <peter@lekensteyn.nl>
|
|
* cipher/rsa.c: Fix.
--
Signed-off-by: Peter Wu <peter@lekensteyn.nl>
|