
Ensure to cancel an archive process if it is already running #13155

Merged
merged 4 commits into 3.x-dev on Jul 23, 2018

Conversation

@tsteur (Member) commented Jul 10, 2018

In #12702 we added a check to avoid running the same archiver twice. However, we still noticed the same command appearing multiple times.

After looking through the code for a while I realized that we start at most 3 concurrent archivers at a time, and if a certain option is set, we might even start only 1 archiver at a time.

When archiving any period, we generate the archiving URLs for the period plus all of its segments. If there are 9 segments, there would be 10 URLs. The problem is that we checked whether a URL was already being executed at URL generation time. But because we run at most 3 jobs concurrently, an archiving job might only be started much later, once all previous requests have finished.

This means we should also check whether such a job is already running shortly before actually executing an archiver.

I have to say I have not tested the script yet, but it should work. For now I'm seeing if the tests succeed.
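The race described above can be sketched in a few lines. This is a minimal, self-contained illustration, not Matomo's actual code: the class and method names below are hypothetical stand-ins for the real archiver queue.

```php
<?php
// Minimal sketch of the race this PR fixes (hypothetical class, not
// Matomo's actual code): URLs are checked for duplicates when they are
// generated, but with a concurrency limit of 3 a queued job may only
// start much later, so the check must be repeated just before launch.

class ArchiveQueue
{
    /** @var array<string, bool> URLs currently being archived */
    private $running = [];

    /** @var int maximum number of concurrent archivers */
    private $concurrency = 3;

    public function isAlreadyRunning($url)
    {
        return isset($this->running[$url]);
    }

    public function start(array $urls)
    {
        $started = [];
        foreach ($urls as $url) {
            if (count($this->running) >= $this->concurrency) {
                break; // later jobs wait; state may change before they run
            }
            // Re-check at launch time, not only at URL generation time.
            if ($this->isAlreadyRunning($url)) {
                continue;
            }
            $this->running[$url] = true;
            $started[] = $url;
        }
        return $started;
    }
}
```

With a single generation-time check, the duplicate URL would slip through whenever its twin only started running after the URLs had already been generated; re-checking at launch time closes that window.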

@tsteur added the "Pull Request WIP" label (indicates the current pull request is still work in progress and not ready yet for a review) on Jul 10, 2018
@tsteur tsteur added this to the 3.6.0 milestone Jul 10, 2018
@@ -1873,6 +1901,7 @@ private function makeCliMulti()
$cliMulti->setUrlToPiwik($this->urlToPiwik);
$cliMulti->setPhpCliConfigurationOptions($this->phpCliConfigurationOptions);
$cliMulti->setAcceptInvalidSSLCertificate($this->acceptInvalidSSLCertificate);
$cliMulti->setConcurrentProcessesLimit($this->getConcurrentRequestsPerWebsite());
tsteur (Member Author):

This change was in theory not needed, because we only set multiple URLs where the limit was already set, but it ensures the maximum number of concurrent requests is always respected.

@tsteur added the "Needs Review" label (PRs that need a code review) and removed the "Pull Request WIP" label on Jul 10, 2018
@tsteur tsteur requested review from diosmosis and mattab July 10, 2018 03:18
$periodInProgress = $this->isAlreadyArchivingAnyLowerOrThisPeriod($idSite, $period);
if ($periodInProgress) {
$this->logger->info("- skipping archiving for period '{period}' because processing the period '{periodcheck}' is already in progress.", array('period' => $period, 'periodcheck' => $periodInProgress));
if ($this->isAlreadyArchivingUrl($url, $idSite, $period, $date)) {
@tsteur (Member Author) commented Jul 10, 2018

In theory we wouldn't need this check and could rely on the one in ->before(), but I figured it is good to have: when the URL was just processed, we do not need to process it again later, even if the job is no longer running by then.


if ($cliMulti->isCommandAlreadyRunning($urlWithSegment)) {
$this->logger->info("- skipping segment archiving for '{segment}' because such a process is already in progress.", array('segment' => $segment));
if ($this->isAlreadyArchivingSegment($urlWithSegment, $idSite, $period, $segment)) {
@tsteur (Member Author) commented Jul 10, 2018

In theory we wouldn't need this check and could rely on the one in ->before(), but I figured it is good to have: when the segment was just processed, we do not need to process it again later, even if the job is no longer running by then.

$self = $this;
$request = new Request($url);
$request->before(function () use ($self, $url, $idSite, $period, $date) {
if ($self->isAlreadyArchivingUrl($url, $idSite, $period, $date)) {
tsteur (Member Author):

By the time the job is executed, seconds, minutes, or even hours might have passed, so we need to check again whether we should still run the job before starting it.
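This last-moment cancellation hook can be sketched generically. The class and constant names below are illustrative stand-ins modeled on the Request::before() / Request::ABORT pattern visible in the diff above; Matomo's real classes differ.

```php
<?php
// Hedged sketch of a last-moment cancellation hook, modeled on the
// Request::before() / Request::ABORT pattern used in this PR. The names
// are illustrative, not Matomo's actual API.

class QueuedRequest
{
    const ABORT = 'abort';

    private $url;
    private $beforeCallback;

    public function __construct($url)
    {
        $this->url = $url;
    }

    public function before(callable $callback)
    {
        $this->beforeCallback = $callback;
    }

    public function execute()
    {
        // Seconds, minutes, or hours may have passed since the request
        // was queued, so give the caller a last chance to cancel it here.
        if ($this->beforeCallback !== null
            && call_user_func($this->beforeCallback) === self::ABORT) {
            return 'skipped:' . $this->url;
        }
        return 'archived:' . $this->url;
    }
}

// Usage: abort the request if the same URL is already being archived.
$alreadyArchiving = ['urlA' => true];
$request = new QueuedRequest('urlA');
$request->before(function () use ($alreadyArchiving) {
    return isset($alreadyArchiving['urlA']) ? QueuedRequest::ABORT : null;
});
```

The key design point is that the callback runs when the worker slot actually becomes free, not when the URL is queued, so it sees the up-to-date "already running" state.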

@diosmosis (Member) commented:
The archiving process is hard to reason about in general, but this looks good to me. I left a comment in Slack, just a thought though.

if ($self->isAlreadyArchivingSegment($urlWithSegment, $idSite, $period, $segment)) {
return Request::ABORT;
}

$processedSegmentCount++;
tsteur (Member Author):

Actually, I'm wondering if I should increase $processedSegmentCount before the if, even when the request is aborted? @mattab?

Member:

I'm not sure it changes anything, as it seems to be used only for logging info on screen. Since it's a very rare edge case, it does not really matter. I would probably leave it as it is?

@tsteur (Member Author) commented Jul 10, 2018

FYI: we ran some tests with it and it worked quite nicely.

@diosmosis diosmosis merged commit 3900cb3 into 3.x-dev Jul 23, 2018
@diosmosis diosmosis deleted the preventdoublearchiving branch July 23, 2018 02:31
InfinityVoid pushed a commit to InfinityVoid/matomo that referenced this pull request Oct 11, 2018
…org#13155)

* Ensure to cancel an archive process if it is already running

* make sure to still launch segments when lower period is already active