core:archive Better detection for end of sharedSiteId queue #16769

tsteur · 2020-11-22T21:42:56Z

Description:

By default, core:archive uses SharedSiteIds, which is shared by multiple workers. It holds a queue of the idSites that still need to be processed. As soon as this queue has been reset, we want existing core:archive commands to finish the next time getNextSiteId is called. This prevents a single core:archive command from running forever and potentially running into memory issues etc.

So far we have two ways of detecting that the queue was reset:

- When the queue is empty: https://github.com/matomo-org/matomo/blob/4.0.0-rc6/core/CronArchive/SharedSiteIds.php#L171-L174
- When suddenly more sites are in the queue than before: https://github.com/matomo-org/matomo/blob/4.0.0-rc6/core/CronArchive/SharedSiteIds.php#L176-L180 (this works e.g. when there is only 1 site in the queue, then a different core:archive resets the list of all idSites and there are now 50 sites in the queue, which means there must have been a reset)

The checks above do not work, though, when e.g. core:archive is running hourly, there is only one site in Matomo (so the queue always has only one site), and it takes > 1 hour to archive this site. What happens: archiver A initializes sharedSiteIds with idSites=[1], then processes the first site so the queue is [], then another core:archive might set sharedSiteIds=[1] again. It's a bit of an edge case, and there would probably need to be some race conditions, as the next archiver should technically empty the queue as well.

Just to be safe, I thought it might be good to check that getNextSiteId doesn't return a siteId that it has already returned before, because this would mean we have already archived that site.
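As a rough illustration of the check described above, here is a minimal sketch in which a worker remembers every site ID it was handed and stops when it sees one again. It is not the code from this PR: the class name `SeenSiteGuard` is hypothetical, and the shared queue is a plain array passed by reference, whereas the real SharedSiteIds keeps it in a persisted option.

```php
<?php
// Hypothetical sketch, not Matomo's SharedSiteIds: the "shared" queue is just
// an array passed by reference so the example stays self-contained.
final class SeenSiteGuard
{
    /** @var array<int, true> site IDs this worker has already been handed */
    private array $seen = [];
    private bool $done = false;

    /** @param int[] $queue shared queue, modified in place */
    public function getNextSiteId(array &$queue): ?int
    {
        if ($this->done || $queue === []) {
            $this->done = true;
            return null;
        }

        // Peek first, so a site we refuse stays queued for other workers.
        $candidate = (int) reset($queue);
        if (isset($this->seen[$candidate])) {
            // Being offered the same site twice implies the queue was refilled.
            $this->done = true;
            return null;
        }

        array_shift($queue);
        $this->seen[$candidate] = true;
        return $candidate;
    }
}

// The single-site edge case from the description:
$queue  = [1];
$worker = new SeenSiteGuard();
$first  = $worker->getNextSiteId($queue); // 1, the queue is now []
$queue  = [1];                            // another archiver resets the queue
$second = $worker->getNextSiteId($queue); // null, site 1 was already handed out
```

Peeking before removing is a choice made for this sketch only, so that the refused site is left in place for whichever archiver refilled the queue.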

Review

Functional review done
Usability review done (is anything maybe unclear or think about anything that would cause people to reach out to support)
Security review done see checklist
Code review done
Tests were added if useful/possible
Reviewed for breaking changes
Developer changelog updated if needed
Documentation added if needed
Existing documentation updated if needed

diosmosis · 2020-11-23T03:47:33Z

This wouldn't work if the next site, for some reason (maybe due to multiple archivers running), isn't in the already processed queue, correct? What about setting an option w/ the timestamp when the queue is created/reset, then checking that this hasn't changed when getNextSiteId() is called?
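The timestamp idea suggested here could look roughly like the sketch below. Everything in it is an assumption made for illustration: `FakeOptionStore` is an in-memory stand-in for a persistent option store, `TimestampedQueue` and its `attach()` method are hypothetical, the value of `KEY_TIMESTAMP` is made up, and timestamps are passed in explicitly so the example is deterministic.

```php
<?php
// Hypothetical in-memory stand-in for a persistent option store.
final class FakeOptionStore
{
    private array $values = [];

    public function set(string $name, $value): void
    {
        $this->values[$name] = (string) $value;
    }

    public function get(string $name)
    {
        return $this->values[$name] ?? false;
    }
}

// Hypothetical queue that detects a reset via a timestamp stored next to it.
final class TimestampedQueue
{
    private const KEY_TIMESTAMP = '_timestamp'; // assumed value, illustration only

    private FakeOptionStore $options;
    private string $optionName;
    private ?int $timestampAtStart = null;
    private bool $done = false;

    public function __construct(FakeOptionStore $options, string $optionName)
    {
        $this->options = $options;
        $this->optionName = $optionName;
    }

    /** @param int[] $siteIds */
    public function resetQueue(array $siteIds, int $nowMs): void
    {
        $this->options->set($this->optionName, implode(',', $siteIds));
        $this->options->set($this->optionName . self::KEY_TIMESTAMP, $nowMs);
    }

    /** Remember which "generation" of the queue this worker started on. */
    public function attach(): void
    {
        $this->timestampAtStart = $this->readTimestamp();
    }

    public function getNextSiteId(): ?int
    {
        // Strict comparison: both sides are ints because readTimestamp() casts.
        if ($this->done || $this->readTimestamp() !== $this->timestampAtStart) {
            $this->done = true;
            return null;
        }

        $raw = (string) $this->options->get($this->optionName);
        $ids = $raw === '' ? [] : array_map('intval', explode(',', $raw));
        if ($ids === []) {
            $this->done = true;
            return null;
        }

        $next = array_shift($ids);
        $this->options->set($this->optionName, implode(',', $ids));
        return $next;
    }

    private function readTimestamp(): int
    {
        return (int) $this->options->get($this->optionName . self::KEY_TIMESTAMP);
    }
}

$store = new FakeOptionStore();

$a = new TimestampedQueue($store, 'ExampleSharedSiteIds');
$a->resetQueue([1], 1000);
$a->attach();
$firstA = $a->getNextSiteId();  // 1

$b = new TimestampedQueue($store, 'ExampleSharedSiteIds');
$b->resetQueue([1], 2000);      // a second archiver resets the queue
$b->attach();

$secondA = $a->getNextSiteId(); // null, the stored timestamp changed
$firstB  = $b->getNextSiteId(); // 1
```

Unlike remembering processed sites, this check does not depend on which site comes next, only on whether the stored marker still matches the one the worker started with.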

tsteur · 2020-11-23T19:55:30Z

@diosmosis changed the PR 👍

diosmosis · 2020-11-24T01:24:52Z

core/CronArchive/SharedSiteIds.php

+            $this->currentSiteId = null;
+            $this->done = true;
+            return null;
+        }


Should we maybe log here? Maybe a debug log or an info, since it seems like it might be useful to have in CLI output...

Not sure it's needed as we would simply stop once iterated over all sites? Not sure how to add easily...

I mean a quick log, like "The shared site ID queue was reset, stopping." (or whatever actually happens if not stopping). Might be useful when looking through the logs?

@diosmosis added a debug log message. It should usually not be that important since it's a regular/normal thing maybe

better to use LoggerInterface instead, but otherwise looks good

diosmosis · 2020-11-24T01:34:13Z

core/CronArchive/SharedSiteIds.php

+
+    public function setQueueWasReset()
+    {
+        Option::set($this->optionName . self::KEY_TIMESTAMP, floor(microtime(true) * 1000));


Not an issue, but curious why microtime is used instead of just time()? Something to do with server configurations?

Because time() might cause concurrency issues, not sure. It's common for users to launch multiple archivers in the same second, so I just wanted it to be more precise to be safe. Using time() probably wouldn't cause an issue.
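To make the precision argument concrete, the sketch below compares a second-resolution marker with a millisecond-resolution one for two resets that fall within the same second. The helper names and the sample timestamps are hypothetical. It also shows why an int cast matters when the marker is compared strictly, since `floor()` returns a float in PHP.

```php
<?php
// Hypothetical helpers; the inputs play the role of microtime(true) readings.
// Assumes 64-bit PHP so millisecond timestamps fit into an int.
function toSecondMarker(float $t): int
{
    return (int) floor($t);
}

function toMillisecondMarker(float $t): int
{
    return (int) floor($t * 1000);
}

// Two queue resets that happen within the same wall-clock second.
$resetA = 1606181653.120;
$resetB = 1606181653.870;

// At second resolution, the second reset is indistinguishable from the first.
$sameSecondMarker = toSecondMarker($resetA) === toSecondMarker($resetB);

// At millisecond resolution, the two markers differ.
$sameMillisecondMarker = toMillisecondMarker($resetA) === toMillisecondMarker($resetB);

// floor() returns a float, so a strict comparison against an int fails
// unless the value is cast first.
$uncastMatches = floor(5.0) === 5;
$castMatches   = (int) floor(5.0) === 5;
```

Here `$sameSecondMarker` is true while `$sameMillisecondMarker` is false, and `$uncastMatches` is false while `$castMatches` is true.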

diosmosis

Looks good, left a couple minor comments

diosmosis · 2020-12-14T02:06:06Z

@tsteur merged after this small tweak: 0bba960

tsteur added 2 commits November 23, 2020 10:34

Better detection for end of sharedSiteId queue

a526477

simpler solution

f588568

tsteur added the Needs Review (PRs that need a code review) label Nov 22, 2020

tsteur added this to the 4.1.0 milestone Nov 22, 2020

simpler solution

c857f45

tsteur added 2 commits November 24, 2020 08:53

use timstamp to detect reset

28df512

use timstamp to detect reset

922942d

diosmosis reviewed Nov 24, 2020

diosmosis approved these changes Nov 24, 2020

tsteur and others added 3 commits November 25, 2020 08:55

add debug log message

abf4ea5

cast to int since we do a strict not equals

0bba960

Merge branch '4.x-dev' into detectemptyqueue

4662074

diosmosis merged commit abf3475 into 4.x-dev Dec 14, 2020

diosmosis deleted the detectemptyqueue branch December 14, 2020 02:06

mattab changed the title ~~Better detection for end of sharedSiteId queue~~ core:archive Better detection for end of sharedSiteId queue Dec 21, 2020
