Rewrite in Rust to obtain standalone static binary
Contrary to Jean-Paul's guideline of not using Rust because of the lack of knowledge about it inside Nexedi, I am using it here because it is the fastest way for me to get a working standalone static binary: it is the language I know best. Given that we need results as soon as possible, this is the best strategy for me. We may later rewrite it in another language if necessary.

A shell script is included to build the static binary. You need to install rustup to get Rust for musl, an alternative libc that allows creating truly static binaries that embed libc itself. Rustup can be found at https://rustup.rs/ and a musl toolchain can be added with:

$ rustup target add x86_64-unknown-linux-musl

The acl library is downloaded and built as a static library by the script, and the Rust build system also builds a vendored copy of OpenSSL as a static library.

Parallel hashing works a bit differently in this Rust version: only the files contained in the currently processed directory are hashed in parallel. If a directory contains a single big file, hashing stays on that file until it is done before moving on to the next directory. To clarify, each file is only ever hashed on a single thread; the Python version does this too, but it keeps a constant number of files being hashed in parallel as long as there are more files to process, whereas this version only hashes one thread per file within the currently processed directory. It was done this way for the sake of simplicity; we can implement an offload thread pool later to mimic what was done in Python.
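To make that strategy concrete, here is a minimal sketch of the per-directory parallelism, assuming Rayon's parallel iterators and only SHA-256 (the real agent computes several digests and collects more metadata); the names `hash_file` and `walk` are illustrative, not taken from the actual code:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

use rayon::prelude::*;
use sha2::{Digest, Sha256};

// Hash one file on a single thread (simplified: reads the whole file at once).
fn hash_file(path: &Path) -> io::Result<String> {
    let data = fs::read(path)?;
    Ok(hex::encode(Sha256::digest(&data)))
}

// Process one directory: its files are hashed in parallel, then each
// subdirectory is visited sequentially, as described above.
fn walk(dir: &Path, results: &mut Vec<(PathBuf, String)>) -> io::Result<()> {
    let mut files = Vec::new();
    let mut subdirs = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            subdirs.push(path);
        } else if path.is_file() {
            files.push(path);
        }
    }

    // One Rayon task per file of the current directory; a single big file
    // keeps this directory busy until its hash is finished.
    let mut hashed: Vec<(PathBuf, String)> = files
        .par_iter()
        .filter_map(|p| hash_file(p).ok().map(|h| (p.clone(), h)))
        .collect();
    results.append(&mut hashed);

    for sub in subdirs {
        walk(&sub, results)?;
    }
    Ok(())
}
```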
[package]
name = "metadata-collect-agent"
version = "0.1.0"
authors = ["Leo Le Bouter <leo.le.bouter@nexedi.com>"]
edition = "2018"

[dependencies]
posix-acl = "1.0.0"
xattr = "0.2.2"
md-5 = "0.9.1"
sha-1 = "0.9.1"
sha2 = "0.9.1"
hex = "0.4.2"
anyhow = "1.0.32"
clap = "2.33.3"
psutil = { git = "https://github.com/leo-lb/rust-psutil", branch = "lle-bout/impl-serde", version = "3.1.0", features = ["serde"] }
reqwest = { version = "0.10.7", features = ["blocking", "native-tls-vendored"] }
rmp-serde = "0.14.4"
nix = "0.18.0"
serde = { version = "1.0.115", features = ["derive"] }
base64 = "0.12.3"
rayon = "1.3.1"

[profile.release]
opt-level = 'z'
lto = true
codegen-units = 1
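The hash-related dependencies above (md-5, sha-1, sha2, hex) can be combined so that each file is read only once while all three digests are computed. A rough sketch, assuming a plain buffered read loop; the function name `hash_all` is illustrative:

```rust
use std::fs::File;
use std::io::{self, BufReader, Read};
use std::path::Path;

use md5::Md5;
use sha1::Sha1;
use sha2::{Digest, Sha256};

fn hash_all(path: &Path) -> io::Result<(String, String, String)> {
    let mut reader = BufReader::new(File::open(path)?);
    let mut md5 = Md5::new();
    let mut sha1 = Sha1::new();
    let mut sha256 = Sha256::new();
    let mut buf = [0u8; 64 * 1024];

    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break;
        }
        // Feed the same chunk to every hasher so the file is read only once.
        md5.update(&buf[..n]);
        sha1.update(&buf[..n]);
        sha256.update(&buf[..n]);
    }

    Ok((
        hex::encode(md5.finalize()),
        hex::encode(sha1.finalize()),
        hex::encode(sha256.finalize()),
    ))
}
```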
-
Owner
I'm halfway through reading https://doc.rust-lang.org/stable/book/ and I have to say Rust is really excellent.
-
Owner
BTW, in slapos!799 (merged) there's the beginning of a Rust component for SlapOS (which takes ~2 hours to compile from source, and I'm not really sure it's reproducible since the setup seems to download things) and Rust support in Theia.
-
Owner
Hashing several files in parallel looks like a bad idea. We already did it in SlapOS for resilience and it was a disaster. It's rare that hashing is slower than IO (maybe only when hashing a big file on high-performance NVMe with the slowest algorithm). Most of the time it's so inefficient that it's actually slower; at best it could be faster, but the hardware consumes a lot more.
But there may be other ways to parallelize the work. First, by doing IO and hashing in separate threads, i.e. pipelining. Also, since you compute several hashes, you could use one thread per hash.
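A minimal sketch of that pipelining idea, assuming one reader thread feeding chunks over a std::sync::mpsc channel to a hashing thread (one channel and thread per algorithm would follow the same shape); only SHA-256 is shown and the names are illustrative:

```rust
use std::fs::File;
use std::io::{self, BufReader, Read};
use std::sync::mpsc;
use std::thread;

use sha2::{Digest, Sha256};

fn sha256_pipelined(path: &str) -> io::Result<String> {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();

    // Hashing runs on its own thread, consuming chunks as they arrive.
    let hasher = thread::spawn(move || {
        let mut sha256 = Sha256::new();
        for chunk in rx {
            sha256.update(&chunk);
        }
        hex::encode(sha256.finalize())
    });

    // The current thread only does IO, so reading and hashing overlap.
    let mut reader = BufReader::new(File::open(path)?);
    let mut buf = [0u8; 64 * 1024];
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break;
        }
        if tx.send(buf[..n].to_vec()).is_err() {
            break;
        }
    }
    drop(tx); // closing the channel lets the hashing thread finish

    Ok(hasher.join().expect("hashing thread panicked"))
}
```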
In any case, I find such an attempt to optimize very premature, in particular if a rewrite is planned.
-
@jm It's really, really slow with a single thread. The use case is whole-file-system hashing, performed during boot with nothing else running on the system, so it's mandatory that it is fast; otherwise we might end up with terrible boot times of dozens of minutes. I did some tests before making it parallel and it made things much faster: I/O utilization is now at 100% on my fast NVMe drive, so it can't get any faster, whereas before it wasn't at 100%. So the first benefit comes from doing I/O from multiple threads. On top of that, on my system, hashing on a single thread is about as slow as (or only a bit faster than) I/O on a single thread, around 100 MB/s, so you would have to parallelize hashing anyway: a single hashing thread couldn't keep up with data coming from I/O done on multiple threads.
The hashing functions come from https://github.com/RustCrypto/hashes. Maybe they could be optimized further, but they already use hand-written assembly in select places for performance, so making the code parallel is easier for me than digging into assembly with cryptographer knowledge that I don't have.
Note that the performance issues were identical with the Python version, which uses OpenSSL, where the hash functions are already well optimized.
> In any case, I find such an attempt to optimize very premature, in particular if a rewrite is planned.
It's pretty trivial in Rust: add an Arc/Mutex around the data structure, then use Rayon's parallel iterator, and it's done, so not much effort or time was spent there.
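For reference, a minimal sketch of that pattern, assuming the shared data structure is just a Vec of (path, digest) pairs and only SHA-256 is computed; names are illustrative, not the actual agent code:

```rust
use std::fs;
use std::path::PathBuf;
use std::sync::{Arc, Mutex};

use rayon::prelude::*;
use sha2::{Digest, Sha256};

fn hash_in_parallel(paths: Vec<PathBuf>) -> Vec<(PathBuf, String)> {
    // Shared, mutable result list: Arc for sharing across threads,
    // Mutex for exclusive access when pushing a result.
    let results = Arc::new(Mutex::new(Vec::new()));

    paths.par_iter().for_each(|path| {
        if let Ok(data) = fs::read(path) {
            let digest = hex::encode(Sha256::digest(&data));
            results.lock().unwrap().push((path.clone(), digest));
        }
    });

    // All Rayon tasks have finished here, so the Arc has a single owner again.
    Arc::try_unwrap(results).unwrap().into_inner().unwrap()
}
```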