Commit 16acbcbc authored by Leo Le Bouter

README: list used hashing algorithms, add benchmark results

parent 001ed5c5
......@@ -5,9 +5,27 @@ In the context of the project [GNU/Linux System files on-boot Tamper Detection S
## Current performance properties
- Reads file system metadata from the main thread (stat, xattrs (SELinux, ..), POSIX ACLs)
- Reads files and hashes them across multiple processes (as many as core count) with the `multiprocessing` python module
- Reads files and hashes them with MD5, SHA-1, SHA-256 and SHA-512 across multiple processes (as many as there are CPU cores) with the `multiprocessing` Python module
- Maximizes disk I/O utilization: the disk, not the Python code, is the bottleneck (a good sign)
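The split described above could look roughly like the following. This is a minimal sketch, not the project's actual code: the function names (`hash_file`, `read_metadata`, `scan`), the 1 MiB chunk size and the pool `chunksize` are assumptions made for illustration, and POSIX ACL handling is left out.

```python
import hashlib
import multiprocessing
import os

ALGORITHMS = ("md5", "sha1", "sha256", "sha512")
CHUNK_SIZE = 1024 * 1024  # hypothetical read size, not taken from the project


def hash_file(path):
    """Read a file once and feed every chunk to all four digests."""
    digests = {name: hashlib.new(name) for name in ALGORITHMS}
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(CHUNK_SIZE), b""):
            for digest in digests.values():
                digest.update(chunk)
    return path, {name: digest.hexdigest() for name, digest in digests.items()}


def read_metadata(path):
    """Collect stat data and xattr names in the calling (main) process."""
    info = os.lstat(path)
    try:
        xattrs = sorted(os.listxattr(path, follow_symlinks=False))
    except (AttributeError, OSError):
        xattrs = []  # platform or file system without xattr support
    return {
        "mode": info.st_mode,
        "uid": info.st_uid,
        "gid": info.st_gid,
        "size": info.st_size,
        "mtime_ns": info.st_mtime_ns,
        "xattrs": xattrs,
    }


def scan(paths):
    """Metadata on the main process, hashing on one worker per CPU core."""
    metadata = {path: read_metadata(path) for path in paths}
    with multiprocessing.Pool(processes=os.cpu_count()) as pool:
        hashes = dict(pool.imap_unordered(hash_file, paths, chunksize=64))
    return metadata, hashes
```

When running this as a script, call `scan()` from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Reading each file once and updating all four digests per chunk keeps disk reads to a single pass per file.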
Tested on a laptop with:
- 3.2 GB/s read NVMe SSD
- Intel(R) Core(TM) i7-1065G7 CPU (4 cores, 8 threads) @ 1.30 GHz min / 3.90 GHz max
- About 2 GHz per thread on average under full multithreaded load, due to heat and suboptimal laptop thermals (Dell XPS 13 2020)
- For ~1 million files on ext4 over LUKS+LVM, with ~140 GB of occupied disk space:
```
real 6m11.532s
user 31m7.676s
sys 3m27.251s
```
That is 6 minutes and 12 seconds of wall-clock time.
This will hardly get any better because the disk is the bottleneck: CPU usage is not maxed out, but disk I/O utilization is, peaking at 500 MB/s of reads under these test conditions. The SSD's 3.2 GB/s figure applies to sequential reads (optimal conditions).
It can, and probably will, be faster on more powerful servers with fewer files, less disk space in use, more CPU cores and a similar disk.
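As a rough sanity check, the `time` output above can be turned into an average core count and read rate. This is a back-of-the-envelope sketch that assumes all ~140 GB were read exactly once and that GB means 10^9 bytes; the variable names are illustrative only.

```python
def to_seconds(minutes, seconds):
    """Convert a `time`-style 'XmY.Zs' figure to seconds."""
    return minutes * 60 + seconds


real = to_seconds(6, 11.532)      # wall-clock time
user = to_seconds(31, 7.676)      # CPU time in user space, all processes
sys_time = to_seconds(3, 27.251)  # CPU time in the kernel, all processes

# Average number of hardware threads kept busy during the run.
busy_cores = (user + sys_time) / real

# Average read rate, assuming ~140 GB (decimal) were read once.
average_mb_per_s = 140e9 / real / 1e6

print(f"busy cores: {busy_cores:.1f}")
print(f"average read rate: {average_mb_per_s:.0f} MB/s")
```

Under these assumptions the run kept roughly 5.6 of the 8 hardware threads busy and averaged about 377 MB/s, which fits the observation that the CPU was not fully used while reads peaked at 500 MB/s.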
## Desired performance properties
- Reduce memory usage
......