A parallel checksum utility using a Merkle tree, designed for huge files.
A single processor thread cannot hash data fast enough to saturate the latest PCIe 4.0 and 5.0 SSDs. This utility splits the work across multiple threads and combines the partial results in a Merkle tree, so that the checksum of a very large file can be computed in a reasonable time.
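As a rough illustration of the idea, the following Python sketch hashes fixed-size blocks in a thread pool and then combines the digests pairwise until a single root remains. The function name `merkle_root`, the block size, and the rule of promoting an unpaired node unchanged are assumptions made for this example. mtsum's actual tree layout is not described here, so this sketch is not expected to reproduce its output.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def merkle_root(data, algorithm="sha256", workers=8, block_size=1 << 20):
    """Return the hex Merkle root of a bytes-like object (illustrative only)."""

    def digest(chunk):
        return hashlib.new(algorithm, chunk).digest()

    view = memoryview(data)
    # Split into fixed-size blocks; an empty input still yields one (empty) leaf.
    blocks = [view[i:i + block_size] for i in range(0, len(view), block_size)] or [view]

    # Leaves are independent of each other, so they can be hashed concurrently.
    # hashlib releases the GIL while hashing buffers larger than 2047 bytes,
    # so threads can run in parallel for realistic block sizes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        level = list(pool.map(digest, blocks))

    # Combine neighbours pairwise; an unpaired last node is promoted unchanged
    # (an assumption of this sketch, other schemes duplicate it instead).
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            next_level.append(digest(b"".join(pair)) if len(pair) == 2 else pair[0])
        level = next_level
    return level[0].hex()
```

Under this construction the root of a multi-block input differs from the plain digest of the same bytes, so a tree checksum is only comparable with checksums produced by the same scheme and block size.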
Usage: mtsum [--help] [--version] [-p processors] [-a algorithm] path
Positional arguments:
  path          path to input file [required]

Optional arguments:
  -h, --help    shows help message and exits
  --version     prints version information and exits
  -p            number of processors to use [default: 8]
  -a            hashing algorithm to use, supported algorithms are md5, sha1, sha256, sha384, sha512 [default: "sha256"]
  -g            output the merkle tree as DOT graph

Misc options (detailed usage):
  -b            enable benchmark
  -v            enable verbose output
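For readers unfamiliar with DOT, the sketch below shows what a small Merkle tree could look like in that format. It is a hypothetical rendering written in Python: the function name `merkle_to_dot`, the short node labels, and the parent-to-child edge direction are assumptions for illustration, and the graph that `-g` actually emits may be labelled and laid out differently.

```python
def merkle_to_dot(levels):
    """Render tree levels (leaves first, root last) as DOT text.

    Labels are assumed to be distinct strings; node i on one level is the
    parent of nodes 2*i and 2*i + 1 on the level below.
    """
    lines = ["digraph merkle {"]
    for depth in range(1, len(levels)):
        for i, parent in enumerate(levels[depth]):
            for child in levels[depth - 1][2 * i:2 * i + 2]:
                lines.append(f'    "{parent}" -> "{child}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical short labels standing in for hex digests.
example = [["h0", "h1", "h2", "h3"], ["h01", "h23"], ["root"]]
dot_text = merkle_to_dot(example)
print(dot_text)
```

Text in this form, saved to a file such as `tree.dot` (a made-up name), can be rendered with Graphviz, for example with `dot -Tsvg tree.dot -o tree.svg`.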
~4.2x faster than Get-FileHash
on a ~183 GiB file.
- OS: Windows 11 Pro 24H2
- CPU: Intel i9-13900KF
- RAM: 64GB Dual-Channel DDR4-3200
- SSD: WD Black SN850X 4TB PCIe 4.0 (Max Seq. Read: 7,300 MB/s)
PS > Measure-Command { mtsum -v ... | Out-Default }
Algorithm: sha256
Number of processors: 8
File size: 196502093824 bytes
c5750c570206464ed6d9b2ef8d290a42fcb8121f97a803c6510ecca5b43ee699
32.99 s (5.96 GB/s)
...
TotalSeconds : 33.1166517
...
PS > Measure-Command { Get-FileHash ... | Out-Default }
...
TotalSeconds : 138.0812053
~4.4x faster than sha256sum
on a ~165 GiB file.
- OS: Debian GNU/Linux trixie/sid
- CPU: AMD EPYC 7203P
- RAM: 512GB Eight-Channel DDR4-3200
- SSD: Micron 7450 Pro 7.68TB U.3 Enterprise SSD (Max Seq. Read: 6,800 MB/s)
$ time ./mtsum -v ...
Algorithm: sha256
Number of processors: 8
File size: 177652487485 bytes
26d9ced146e549ecb6848d421a9f4f483206c57a9428d9232af7984db84c4f3b
27.62 s (6.43 GB/s)
real 0m27.634s
user 1m49.710s
sys 0m3.348s
$ time sha256sum ...
5ce5b397d323cde668b77c08e17c48f6a5b6972671aa401d33e91faf1e366048 ...
real 2m2.146s
user 1m44.980s
sys 0m17.137s
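The headline factors follow directly from the timings in the two transcripts above. The short Python sketch below redoes that arithmetic; the helper names are made up for this example, and the sha256sum wall-clock time of 2m2.146s is entered as 122.146 seconds.

```python
def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


def throughput_gb_per_s(size_bytes, seconds):
    """Average throughput in decimal gigabytes per second."""
    return size_bytes / seconds / 1e9


def size_gib(size_bytes):
    """File size in binary gibibytes."""
    return size_bytes / 2**30


# Windows run: Get-FileHash vs. mtsum (TotalSeconds from Measure-Command).
windows_speedup = speedup(138.0812053, 33.1166517)
windows_rate = throughput_gb_per_s(196502093824, 32.99)

# Linux run: sha256sum vs. mtsum ("real" times reported by time).
linux_speedup = speedup(122.146, 27.634)
linux_rate = throughput_gb_per_s(177652487485, 27.62)

print(f"Windows: {windows_speedup:.2f}x, {windows_rate:.2f} GB/s, {size_gib(196502093824):.1f} GiB")
print(f"Linux:   {linux_speedup:.2f}x, {linux_rate:.2f} GB/s, {size_gib(177652487485):.1f} GiB")
```

The throughput figures use decimal gigabytes (10^9 bytes), which is why a ~183 GiB file read in about 33 seconds comes out at roughly 5.96 GB/s.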
- CMake 3.20 or higher
- vcpkg
- make or ninja
- Any C++ compiler that supports C++20 or higher
Note: vcpkg will automatically download and install the dependencies for you.
- Run
cmake --preset=release-ninja
or
cmake --preset=release-make
to generate the build files.
- Run
cd cmake-build-release && make
or
cd cmake-build-release && ninja
to build the project, using the tool that matches the preset you chose.
- Run
cmake --preset=release-ninja-static
or
cmake --preset=release-make-static
to generate the build files for a static build.
- Run
cd cmake-build-release-static && make
or
cd cmake-build-release-static && ninja
to build the project, using the tool that matches the preset you chose.
This project is developed under the direction of Dr. Jaroslaw Zola.