Add option to specify the max number of websites to process by archiver. #18326
Conversation
This makes it possible to limit the process lifetime to avoid excessive memory usage (possibly due to PHP bugs such as 79519).
Looks like I can't add the "Needs Review" label.
@EreMaijala Thank you for this contribution, it looks like a very useful option for installations with a large number of websites. It works well in the tests I've done and I can't see any issues. @sgiehl could you take a quick look in case I've overlooked anything?
@EreMaijala @bx80 I wonder if it would generally make more sense to stop after a certain number of archives processed? That way it might work better (for example, if one site has heaps of invalidations and therefore lots to archive, which can happen e.g. when creating or changing a segment), and it would potentially also help other people who have fewer sites but a similar issue. Not sure if that would help or make any sense?
@tsteur Stopping after a certain number of archives would probably be even better, but as the process currently works, the next execution always seems to pick the next site instead of the next archive to process, and it would need much more extensive changes to make this work smoothly with archives.
@tsteur To elaborate on the previous: since the queue contains site IDs and each archiver takes a site ID from the queue to process its archives, it's not easy to stop in the middle of a site. Perhaps it would be possible to put the site back into the queue, but that feels like a hack to me.
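For context, the site-queue behaviour described above can be sketched roughly like this. This is a simplified, hypothetical illustration, not the actual CronArchive code; names such as `$siteQueue`, `archiveAllPeriodsFor()`, and `$maxWebsitesToProcess` are made up for the example:

```php
<?php
// Hypothetical sketch: each worker takes a whole site from the shared
// queue and archives all of its invalidated periods before moving on.
$websitesProcessed = 0;

while (($idSite = $siteQueue->getNextSiteId()) !== null) {
    archiveAllPeriodsFor($idSite);

    $websitesProcessed++;

    // A per-site limit is easy to enforce here, because stopping between
    // sites never leaves a site half-archived.
    if ($maxWebsitesToProcess && $websitesProcessed >= $maxWebsitesToProcess) {
        break;
    }
}
```

This is why a per-site limit is the natural fit for the current design: the only safe stopping points are site boundaries.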
@EreMaijala not sure if it helps, I was thinking something like this might work:

```diff
diff --git a/core/CronArchive.php b/core/CronArchive.php
index bbdd8a691..a48e1098c 100644
--- a/core/CronArchive.php
+++ b/core/CronArchive.php
@@ -389,6 +389,13 @@ class CronArchive
                 $successCount = $this->launchArchivingFor($archivesToProcess, $queueConsumer);
                 $numArchivesFinished += $successCount;
+
+
+                if ($this->maxArchivesToProcess && $numArchivesFinished >= $this->maxArchivesToProcess) {
+                    $this->logger->info("Maximum number of archives to process per execution has been reached.");
+                    break;
+                }
+
             }
```

It's not the full code as
@tsteur I believe that's what I initially tried, but that resulted in the situation I described, where the execution can stop in the middle of a site, and the next execution then starts from the next site. As far as I can see, the archiver will detect the skipped ones after a day (with the message "Found X failed jobs (ts_invalidated older than 1 day), resetting status to try them again."), but that means their processing will be delayed, and potentially skipped again if processing happens to stop at just the wrong moment. To illustrate this, I added some output and a hard-coded limit of 10 archives:
Results from first execution:
Results from second execution:
It starts from the next site (427) instead of finishing periods 3 and 4 for site 426. This is why I made it stop only after processing all archives for a site.

That said, it would be possible to do the check only when getNextArchivesToProcess returns an empty array, which indicates that archiving for the current website has completed. That, however, would make the stopping point inexact, more like 'stop after processing at least X archives', which wouldn't be so different from max sites to process, but might be easier to use when there are lots of invalidations but only occasionally. I'll add that as an option, and you can then decide whether to drop either one. For us it doesn't make much difference, as we can usually process something like 3000 invalidations before memory_limit is exceeded.
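The "check only at site boundaries" idea could be sketched roughly like this. getNextArchivesToProcess is the method named in the discussion; everything else (`launchArchivingFor()`, `$maxArchivesToProcess`, the null-vs-empty return convention) is an illustrative assumption, not the real CronArchive implementation:

```php
<?php
// Hypothetical sketch: enforce the archive limit only at site boundaries,
// i.e. when the consumer has nothing left for the current site.
$numArchivesFinished = 0;

while (true) {
    $archivesToProcess = $queueConsumer->getNextArchivesToProcess();

    if ($archivesToProcess === null) {
        break; // assumed convention: nothing left in the queue at all
    }

    if (empty($archivesToProcess)) {
        // An empty array indicates the current site is fully archived,
        // so this is a safe point to stop without leaving it half-done.
        if ($maxArchivesToProcess && $numArchivesFinished >= $maxArchivesToProcess) {
            break; // "stop after processing at least X archives"
        }
        continue; // move on to the next site in the queue
    }

    $numArchivesFinished += launchArchivingFor($archivesToProcess);
}
```

The trade-off is visible in the sketch: the limit is only consulted between sites, so a site with many invalidations can overshoot the limit before the check fires.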
@tsteur New option added. I think I'd prefer to keep both options, since that would allow specifying both and stopping processing when either limit is reached. I also considered naming the option something else than
I reckon it wouldn't be a big issue, since most of the time it's only launching something like 3 or so in parallel.
Indeed. It would continue with the next sites; then, once all the other sites have been worked on, it would continue again from the beginning and process the ones that haven't been processed yet. Depending on how quick the archiving is and how many sites there are, it would continue the aborted site in, say, 2 hours or maybe later. It should usually not be an issue, but it depends on the situation.
@tsteur Ah, I missed that part and thought it would always take 24 hours to resume. It's not that bad then, sorry for the confusion. I'm still not fully convinced it's no problem at all, as you could get somewhat wonky results when looking at analytics for a site that's only partially archived, no? I mean, it's obviously not processed if the information isn't available at all, but if you have, say, day and week but no month or year for several hours, it could be confusing.
@EreMaijala it totally depends on the users of Matomo. For us it's not an issue, and on our own Matomo we're even calculating month and year, for example, less often, as having them up to date is not as important. If the data in the year is, say, 2-4 hours behind, out of up to 8800 hours, then it's not too big of an issue. If you were to sum up every week and day manually and came to a different result than the month period, then it could be confusing.
@tsteur Fair enough. I've now updated this to make --max-archives-to-process a hard limit while keeping --max-websites-to-process as well. I believe this allows enough flexibility for most use cases.
This all looks good to me and works well in testing, thanks @EreMaijala for adding both options. @tsteur Okay to merge? |
Yes, all good to merge. Technically we might want to create an FAQ for it. Maybe it needs to be mentioned at least on https://matomo.org/docs/setup-auto-archiving/
I've updated the core:archive help output in the setup auto archiving user guide and also created a draft FAQ explaining how these options work and where they should be added.
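For reference, a crontab entry combining both options might look like the following. The path, schedule, user, and limit values are illustrative and would need to be adapted to the installation:

```
# Illustrative crontab entry (adjust path, user, and limits for your setup).
# Stops the archiver after 100 websites or 1000 archives, whichever is hit first;
# the next scheduled run picks up where this one left off.
5 * * * * www-data /usr/bin/php /var/www/matomo/console core:archive --max-websites-to-process=100 --max-archives-to-process=1000 > /dev/null
```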
Thank you all!
Description:
This change makes it possible to limit the process lifetime to avoid excessive memory usage (possibly due to PHP bugs such as 79519).
Rationale:
We have an issue with the archiver and a large number (~3900) of sites where the archiver process runs out of memory. We have tried to increase memory_limit up to 12 GB to no avail, so this is an attempt to limit the lifetime of the archiver to avoid this excessive memory usage.
As a side note, there seem to be some static or long-lived arrays that consume memory during archiving, such as the transient cache, but they don't seem to explain everything. I suspect PHP issues such as 79519 contribute to this.