New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Provide more documentation on some archiving behaviour #16955
Comments
Added an answer to the first question here: https://github.com/matomo-org/developer-documentation/pull/402/files#diff-89a35a798a82b03c6fd7dbc734fb0494f586e285c2cb55e3785d50b350061d21R131 |
It is, but this is done for the entire process and core:archive runs for a long time. The cache entry has a ttl if you look at where it's set: https://github.com/matomo-org/matomo/blob/4.x-dev/core/CronArchive.php#L1042 I'm not entirely sure where to document this. |
This was added a while ago, it was part of an optimization, but I can't remember the exact reason for it... The time it takes to run ArchiveCronTest doesn't change w/ this code there/not-there, so I'll remove it. |
@tsteur noticing now the last invalidation time needs to be per site here... or perhaps we need to use the last core archive finish time now (or start time?), since for a global segment we'd need to invalidate for every site, but that wouldn't happen if we update the last invalidation time like this... actually it might if we do it per site... |
@diosmosis not sure I understand https://github.com/matomo-org/developer-documentation/pull/402/files#diff-89a35a798a82b03c6fd7dbc734fb0494f586e285c2cb55e3785d50b350061d21R138-R139 wouldn't we simply invalidate back a few months (depending on config) since segment creation / update date? Last invalidation time would be usually today or yesterday? And in the code in https://github.com/matomo-org/matomo/blob/4.x-dev/core/CronArchive/SegmentArchiving.php#L157-L171 it seems to be responsible to simply skip invalidating when eg we last archived today and segment was last edited yesterday. I wonder if this could even cause issues when we skip archiving data for segments for today and meanwhile wouldn't we still always want to check the Sorry I'm still not quite understanding this one. |
We need be able to detect when the segment is new/updated, and of course we don't want to end up invalidating in the past every time we encounter the segment. IIRC this is what we did before the refactor.
This could be a problem. As said above, we need to do the invalidation once per create/update, and we need to know when we've done it. Currently it's "has the segment been created/changed after the last time we invalidated? if yes, then we haven't invalidated in the past for it and need to".
This is still checked? This is how far back we go, it's unrelated to checking whether we've done it before. |
Now I get it 👍 can we maybe similar to custom reports etc invalidate when segment is created or updated? Maybe this way we can get rid of all this and it would always work independent of any race condition and what segments have been invalidated etc |
This is a MUCH better idea. I'll make a PR. |
👍 |
@diosmosis I reckon this one we can close? |
Yes, will close. Don't think I wrote everything down, but it should be enough to get someone started on the system. |
@diosmosis just looking at some DB performance insights where a lot of new options are showing up in the top queries.
Then looked where they are used and wasn't really understanding what they do and whether they even need to be cached in an option.
$this->rawLogDao->hasSiteVisitsBetweenTimeframe
directly to see if there's been any visit during the archive date range?hasSiteVisitsBetweenTimeframe
but not sure when that actually happens and whether we can't just executehasSiteVisitsBetweenTimeframe
directly? Eg because it's a lazy cache it causes even a lot of inserts in server file sync when caches need to be cleared across servers etc and overall it might cause more problems caching it in this case than fetching it from DB directly I wonder.Might be good to see if we can make some tweaks or remove some code and also maybe add some inline comments explaining what happens as I tried to understand it and wasn't sure what's happening there.
The text was updated successfully, but these errors were encountered: