Commit ff2f0b67 authored by Kirill Smelkov

restore: Extract packs in multiple workers

This lets us leverage multiple CPUs on a system for pack extractions,
which are computation-heavy operations.

The way to do it is more-or-less classical:

    - main worker prepares requests for pack extraction jobs

    - there are multiple pack-extraction workers, which read requests
      from jobs queue and perform them

    - at the end we wait for everything to stop and collect errors,
      optionally signalling the whole thing to cancel if we see an error
      coming. (It is only a signal; we still have to wait for
      everything to stop.)

The default number of workers is N(CPU) on the system, because we spawn
a separate `git pack-objects ...` for every request.

We also now explicitly limit the N(CPU) each `git pack-objects ...` can
use to 1. This way control over how many resources to use is in
git-backup's hands, and git also packs better this way (when using only
1 thread): when deltifying, all objects are considered against each
other, not only the objects inside 1 thread's object pool. And even when
pack.threads is not 1, the first "objects counting" phase of packing is
serial, wasting all but 1 core.

On lab.nexedi.com we already use pack.threads=1 by default in the global
gitconfig, but the above change makes the code universal.

Time to restore nexedi/ from lab.nexedi.com backup:

2CPU laptop:

    before (pack.threads=1)     10m11s
    before (pack.threads=NCPU)   9m13s
    after  -j1                  10m11s
    after                        6m17s

8CPU system (with other load present, noisy):

    before (pack.threads=1)     ~5m
    after                       ~1m30s
parent 6c2abbbf
@@ -251,3 +251,18 @@ func erraddcallingcontext(topfunc string, e *Error) *Error {
	return e
}

// error merging multiple errors (e.g. after collecting them from several parallel workers)
type Errorv []error

func (ev Errorv) Error() string {
	if len(ev) == 1 {
		return ev[0].Error()
	}

	msg := fmt.Sprintf("%d errors:\n", len(ev))
	for _, e := range ev {
		msg += fmt.Sprintf("\t- %s\n", e)
	}
	return msg
}