path: root/digest.py
Age | Commit message | Author | Files | Lines
2014-10-24 | digest.py: use threading instead | Peter Wu | 1 | -7/+8
At first I used multiprocessing because threading would still run on only one CPU core due to the GIL. That probably happened because I accessed _hashes[algos] in the loop of _queue_updater. Now that this is no longer done and only the hash update function is called, which releases the GIL for data larger than 2 KiB, multiple cores are actually used.

For comparison, for a file of 2.3 GiB (min/avg/max/sd secs for n=10):

- pee sha256sum md5sum < file: 16.5/16.9/17.4/.305
- python3 digest.py -sha256 -md5 < file: 13.7/15.0/18.7/1.77
- python2 digest.py -sha256 -md5 < file: 13.7/15.9/18.7/1.64
- jacksum -a sha256+md5 -F '#CHECKSUM{i} #FILENAME': 32.7/37.1/50/6.91

The file is actually 2367029248 bytes, resident in the disk cache.

Environment:

- CPU: Intel i5-460M
- Arch Linux x86_64
- Linux 3.17-rc4
- coreutils 8.23
- moreutils 0.51
- jacksum 1.7.0 on OpenJDK 1.7.0_71
- Python 3.4.2, Python 2.7.8
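The threading approach described in this commit message can be sketched roughly as follows. This is a minimal illustration and not the code from digest.py: the names `threaded_digests`, `_worker`, and `CHUNK_SIZE`, the queue depth, and the chunk size are all assumptions. Each worker thread only calls `update()` on its own hash object, so the hashing itself can run on several cores while hashlib releases the GIL.

```python
import hashlib
import io
import threading
from queue import Queue  # Python 3 name; the module is called Queue on Python 2

# Assumed chunk size, chosen to be far above the ~2 KiB threshold at which
# hashlib releases the GIL during update().
CHUNK_SIZE = 1 << 20


def _worker(hasher, chunks):
    """Feed chunks from one queue into one hash object until None arrives."""
    while True:
        chunk = chunks.get()
        if chunk is None:  # sentinel: no more data
            break
        hasher.update(chunk)


def threaded_digests(stream, algos):
    """Return {algorithm: hexdigest} for a binary stream, one thread per algorithm."""
    hashers = {name: hashlib.new(name) for name in algos}
    # A bounded queue keeps the reader from buffering the whole file in memory.
    queues = {name: Queue(maxsize=4) for name in algos}
    threads = [
        threading.Thread(target=_worker, args=(hashers[name], queues[name]))
        for name in algos
    ]
    for thread in threads:
        thread.start()
    # Read the stream once and hand every chunk to every worker.
    for chunk in iter(lambda: stream.read(CHUNK_SIZE), b""):
        for q in queues.values():
            q.put(chunk)
    for q in queues.values():
        q.put(None)
    for thread in threads:
        thread.join()
    return {name: h.hexdigest() for name, h in hashers.items()}
```

A call such as `threaded_digests(sys.stdin.buffer, ["sha256", "md5"])` would correspond to the `-sha256 -md5 < file` invocation measured above, assuming Python 3.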
2014-10-24 | digest.py: calculate multiple digests | Peter Wu | 1 | -0/+144
Created for "Simultaneously calculate multiple digests (md5, sha256)", http://unix.stackexchange.com/q/163747/8250

This implementation has a simple single-threaded class (Hasher) and a multi-threaded one. Currently uses the multiprocessing.Queue interface.
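For illustration, a single-threaded hasher of the kind mentioned here might look like the sketch below. Only the class name Hasher comes from the commit message; its methods, the `digest_stream` helper, and the default chunk size are hypothetical and may differ from the real digest.py.

```python
import hashlib
import io


class Hasher:
    """Update several hash objects with the same data, one after another."""

    def __init__(self, algos):
        # One hashlib object per requested algorithm name, e.g. "md5".
        self._hashes = {name: hashlib.new(name) for name in algos}

    def update(self, data):
        for h in self._hashes.values():
            h.update(data)

    def hexdigests(self):
        return {name: h.hexdigest() for name, h in self._hashes.items()}


def digest_stream(stream, algos, chunk_size=65536):
    """Read a binary stream once and return {algorithm: hexdigest}."""
    hasher = Hasher(algos)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigests()
```

The input is read only once, which is the point of the linked question, but the digests are computed sequentially for each chunk, so this variant uses a single core.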