@tsteur opened this Issue on December 15th 2020 Member

@diosmosis just looking at some DB performance insights where a lot of new options are showing up in the top queries.

Then looked where they are used and wasn't really understanding what they do and whether they even need to be cached in an option.

Might be good to see if we can make some tweaks or remove some code and also maybe add some inline comments explaining what happens as I tried to understand it and wasn't sure what's happening there.

@diosmosis commented on December 17th 2020 Member
@diosmosis commented on December 17th 2020 Member

Also maybe the transient cache here is not needed as maybe the Option class itself caches it already?

It is, but this is done for the entire process and core:archive runs for a long time. The cache entry has a ttl if you look at where it's set: https://github.com/matomo-org/matomo/blob/4.x-dev/core/CronArchive.php#L1042

I'm not entirely sure where to document this.

@diosmosis commented on December 17th 2020 Member

Seeing also this min visit time for site option in https://github.com/matomo-org/matomo/blob/4.x-dev/core/ArchiveProcessor/Loader.php#L427-L442 . Not sure why this one is needed in https://github.com/matomo-org/matomo/blob/4.x-dev/core/ArchiveProcessor/Loader.php#L413-L422 and how that logic works?

Like we could just query the $this->rawLogDao->hasSiteVisitsBetweenTimeframe directly to see if there's been any visit during the archive date range?

This was added a while ago, it was part of an optimization, but I can't remember the exact reason for it... The time it takes to run ArchiveCronTest doesn't change w/ this code there/not-there, so I'll remove it.

@diosmosis commented on December 17th 2020 Member

@tsteur noticing now the last invalidation time needs to be per site here... or perhaps we need to use the last core archive finish time now (or start time?), since for a global segment we'd need to invalidate for every site, but that wouldn't happen if we update the last invalidation time like this... actually it might if we do it per site...

@tsteur commented on December 18th 2020 Member

@diosmosis not sure I understand https://github.com/matomo-org/developer-documentation/pull/402/files#diff-89a35a798a82b03c6fd7dbc734fb0494f586e285c2cb55e3785d50b350061d21R138-R139

wouldn't we simply invalidate back a few months (depending on config) since segment creation / update date?

Last invalidation time would be usually today or yesterday?

And in the code in https://github.com/matomo-org/matomo/blob/4.x-dev/core/CronArchive/SegmentArchiving.php#L157-L171 it seems to be responsible to simply skip invalidating when eg we last archived today and segment was last edited yesterday.

I wonder if this could even cause issues when we skip archiving data for segments for today and meanwhile lastInvalidationTime is today and then it would skip it.

wouldn't we still always want to check the processNewSegmentsFrom setting in https://github.com/matomo-org/matomo/blob/4.x-dev/core/CronArchive/SegmentArchiving.php#L172-L215 ?

Sorry I'm still not quite understanding this one.

@diosmosis commented on December 18th 2020 Member

wouldn't we simply invalidate back a few months (depending on config) since segment creation / update date?

We need be able to detect when the segment is new/updated, and of course we don't want to end up invalidating in the past every time we encounter the segment.

IIRC this is what we did before the refactor.

I wonder if this could even cause issues when we skip archiving data for segments for today and meanwhile lastInvalidationTime is today and then it would skip it.

This could be a problem. As said above, we need to do the invalidation once per create/update, and we need to know when we've done it. Currently it's "has the segment been created/changed after the last time we invalidated? if yes, then we haven't invalidated in the past for it and need to".

wouldn't we still always want to check the processNewSegmentsFrom setting in https://github.com/matomo-org/matomo/blob/4.x-dev/core/CronArchive/SegmentArchiving.php#L172-L215 ?

This is still checked? This is how far back we go, it's unrelated to checking whether we've done it before.

@tsteur commented on December 18th 2020 Member

This could be a problem. As said above, we need to do the invalidation once per create/update, and we need to know when we've done it. Currently it's "has the segment been created/changed after the last time we invalidated? if yes, then we haven't invalidated in the past for it and need to".

Now I get it 👍 can we maybe similar to custom reports etc invalidate when segment is created or updated? Maybe this way we can get rid of all this and it would always work independent of any race condition and what segments have been invalidated etc

@diosmosis commented on December 18th 2020 Member

Now I get it 👍 can we maybe similar to custom reports etc invalidate when segment is created or updated?

This is a MUCH better idea. I'll make a PR.

@tsteur commented on December 18th 2020 Member

👍

@tsteur commented on August 20th 2021 Member

@diosmosis I reckon this one we can close?

@diosmosis commented on August 20th 2021 Member

Yes, will close. Don't think I wrote everything down, but it should be enough to get someone started on the system.

This Issue was closed on August 20th 2021
Powered by GitHub Issue Mirror