Archiver should skip creating a new "temporary" archive when there was no new visit/traffic since the last "temporary archive" #13387
Comments
When plugins force archiving in this way, they presumably have good reasons to do so, so their forcing would override any logic we add around skipping websites, segments, etc. |
@mattab I think the first bullet point actually solves that as well, doesn't it? I.e., if the archiver sends requests for segments, we'll see there are no visits and avoid archiving. Adding this logic to CronArchive.php while keeping some logic in PluginsArchiver.php for plugins that force archiving would be rather difficult, I think. |
@mattab can you reply to my last comment? |
The first bullet point (core archiver) fixes part of the problem, which may be enough for now, but I thought there would be a lot of improvements hidden in the second bullet point (core:archive / CronArchive). For example, imagine a Matomo with 1,000 sites and 10 global segments, so 10,000 site/segment combinations. If we only do the first bullet point, we still need to send 10,000 requests * 5 periods = 50K requests, which would take a long time, possibly hours? What do you think (not sure if my numbers are correct)? |
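To make the estimate above concrete, here is a rough back-of-the-envelope sketch. It is written in Python purely for illustration (Matomo itself is PHP), the function name is hypothetical, and it assumes one request per site/segment/period combination, as in the comment above.

```python
def archive_request_count(sites: int, segments_per_site: int, periods: int) -> int:
    """Number of archiving requests if every site/segment/period
    combination is requested once (the assumption made in the thread)."""
    return sites * segments_per_site * periods


# The scenario from the comment: 1,000 sites, 10 global segments, 5 periods.
total = archive_request_count(sites=1000, segments_per_site=10, periods=5)
print(total)  # 50000

# If only a fraction of sites had new visits and the rest could be skipped
# up front, the count shrinks proportionally (100 active sites is made up).
print(archive_request_count(sites=100, segments_per_site=10, periods=5))  # 5000
```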
The numbers seem right, but since we don't know if a plugin will force archiving, we can't really skip those requests... not w/o taking the forcing outside of the core archiving logic and putting it in CronArchive. I think that would be rather non-trivial. |
Maybe @tsteur has an idea or some thoughts? |
maybe it could be implemented only when Ideally otherwise we just set a flag for each site when starting or finished archiving like Can you otherwise mention some plugins that force archiving? could they listen/reuse the same event in |
@tsteur I think the event that needs to be respected is |
Posting it again may help 👍 Not sure for what we added this event? |
I don't think I was here for it, but I thought it was for something like SEO or importing stats from somewhere else? Where we'd want to archive, but not be dependent on visits in Matomo. |
No idea, I can't see it in any of our plugins... also can't find it in an issue or changelog |
@mattab / @tsteur I implemented the change in the code and, thinking about it again, it might be a pointless addition. It looks like this is already in CronArchive.php: https://github.com/matomo-org/matomo/blob/3.x-dev/core/CronArchive.php#L1261-L1292 This code checks whether there have been visits since the last successful archiving time, which is pretty close to the time of the latest archive for today. I think the only case where it will save time is if archiving fails but some archives still get created. What do you think, is it still useful? You can see my changes here: 23b5a18 |
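For readers following along, the kind of check being discussed can be sketched roughly as below. This is only an illustration in Python, not the actual CronArchive.php code: the function name, its parameters, and the `forced` flag (standing in for a plugin that forces archiving) are all assumptions made for this sketch.

```python
from datetime import datetime
from typing import Optional


def should_skip_archiving(
    last_successful_archive: Optional[datetime],
    last_visit: Optional[datetime],
    forced: bool = False,
) -> bool:
    """Return True when re-archiving can be skipped (hypothetical helper).

    Skips only if an earlier archive exists and no visit was recorded
    after it; a forced archive is never skipped.
    """
    if forced:
        return False  # a plugin asked for archiving regardless of visits
    if last_successful_archive is None:
        return False  # nothing archived yet, so we must archive
    if last_visit is None:
        return True  # archived before and still no visits at all
    return last_visit <= last_successful_archive


archived_at = datetime(2018, 9, 1, 12, 0)
print(should_skip_archiving(archived_at, datetime(2018, 9, 1, 11, 0)))  # True
print(should_skip_archiving(archived_at, datetime(2018, 9, 1, 13, 0)))  # False
```

The `forced` branch reflects the concern raised earlier in the thread: if a plugin forces archiving, any skip logic has to give way to it.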
Refs #14639 |
yes 👍 |
Challenge: make archiving faster when there are hundreds of websites
Solution:
This should probably be implemented in two places.
from #5922 (comment)