In https://github.com/matomo-org/matomo/pull/12702 we added a check to avoid running the same archiver twice. However, we still noticed the same command appearing multiple times.
After looking through the code for a while, I realized that we start at most 3 concurrent archivers at a time, and if a certain option is set, we might even start only 1 archiver at a time.
When archiving any period, we generate the archiving URL for the period plus one for each segment. If there are 9 segments, there would be 10 URLs. The problem is that we checked whether a URL was already being executed at URL-generation time. But because we run at most 3 jobs concurrently, an archiving job might only be started much later, after all previous requests have finished.
This means we should also check, shortly before an archiver is actually executed, whether such a job is already running.
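To make the idea concrete, here is a minimal sketch of the check-before-execution pattern. It is in Python rather than Matomo's PHP, and all names (`RunningJobs`, `try_start`, `finish`) are hypothetical stand-ins for whatever shared state (e.g. a DB-backed option) the archiver actually uses:

```python
import threading

class RunningJobs:
    """Hypothetical in-memory registry of archiving URLs currently
    being executed; illustrative only, not Matomo's actual API."""

    def __init__(self):
        self._lock = threading.Lock()
        self._running = set()

    def try_start(self, url):
        # Atomic check-and-mark: returns False when the same archiving
        # URL is already being executed by another worker.
        with self._lock:
            if url in self._running:
                return False
            self._running.add(url)
            return True

    def finish(self, url):
        with self._lock:
            self._running.discard(url)


registry = RunningJobs()
url = "?module=API&method=API.get&period=day&date=today"

# The URL is generated (queued) early, but the job is only claimed
# just before execution, so a duplicate queued in the meantime is skipped.
print(registry.try_start(url))   # first worker claims the job: True
print(registry.try_start(url))   # duplicate is skipped while it runs: False
registry.finish(url)
print(registry.try_start(url))   # after completion it may run again: True
```

The key point is that the check and the "mark as running" step happen atomically at execution time, not when the URL list is generated.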
I have to say I have not tested the script manually yet, but it should work. For now, I'm seeing whether the tests succeed.
The archiving process is hard to reason about in general, but this looks good to me. I left a comment in Slack, just a thought though.
FYI, we ran some tests with it and it worked quite nicely.