@tsteur opened this Pull Request on May 27th 2020 Member

@diosmosis Noticed an error that today wasn't being archived while testing again https://github.com/matomo-org/matomo/pull/15991 . Eg I had set config to:

enable_general_settings_admin=0
time_before_today_archive_considered_outdated = 1

Timezone of the site: America/Los_Angeles

I would have expected that the archive is being invalidated every time I execute the console archive command but this wasn't happening. Then noticed $minDatetimeArchiveProcessedUTC was already a date and therefore somehow converted again resulting in a different time.

See screenshot where the archive should have been ignored because ts_archived is older but it wasn't ignored.
image

See what happened to the date after passing it again to date::factory:
image

Looks actually a bit like a bug to me that Date::factory would set time to 00:00:00.

Another potential bug maybe @diosmosis ... the last line in that method returns return [$idArchive, $visits, $visitsConverted, true, $tsArchived];. Should the ignore parameter really be set to true?

@tsteur commented on May 27th 2020 Member

@diosmosis I think there's also still another bug. Maybe around https://github.com/matomo-org/matomo/blob/todayarchive/core/CronArchive.php#L876-L880

When I set the TTL for archives to 1 second. Then I would expect also that this week, this month and this year is being archived every time I execute core:archive but it looks like it doesn't do this. Eg here it only launches day https://github.com/matomo-org/matomo/blob/todayarchive/core/CronArchive.php#L960

@diosmosis commented on May 27th 2020 Member

Another potential bug maybe @diosmosis ... the last line in that method returns return [$idArchive, $visits, $visitsConverted, true, $tsArchived];. Should the ignore parameter really be set to true?

I'm not sure what you mean by "ignore parameter", that value is whether we found any archive at all for the period/site/segment/plugin. It's used in Loader.php to avoid invalidating if there's no archive at all.

@diosmosis commented on May 27th 2020 Member

When I set the TTL for archives to 1 second. Then I would expect also that this week, this month and this year is being archived every time I execute core:archive but it looks like it doesn't do this. Eg here it only launches day https://github.com/matomo-org/matomo/blob/todayarchive/core/CronArchive.php#L960

This would mean that the archive invalidation isn't invalidating week/month/year as well. Are you able to check if after that code is run, there are entries for those in archive_invalidations? And that the archives are set to DONE_INVALIDATED? If there are, then there's an issue in core:archive selecting the wrong invalidations.

@tsteur commented on May 27th 2020 Member

I'm not sure what you mean by "ignore parameter", that value is whether we found any archive at all for the period/site/segment/plugin. It's used in Loader.php to avoid invalidating if there's no archive at all.

Ignore this. I was reading it wrong.

There is no entry afterwards in the invalidations table and this is what I got as output:

INFO [2020-05-27 21:25:18] 75393  START
INFO [2020-05-27 21:25:18] 75393  Starting Matomo reports archiving...
INFO [2020-05-27 21:25:18] 75393  Checking for queued invalidations...
INFO [2020-05-27 21:25:18] 75393    Will invalidate archived reports for today in site ID = 1's timezone (2020-05-27 00:00:00).
INFO [2020-05-27 21:25:18] 75393    Will invalidate archived reports for yesterday in site ID = 4's timezone (2020-05-27 00:00:00).
INFO [2020-05-27 21:25:18] 75393  Done invalidating
@diosmosis commented on May 27th 2020 Member

There is no entry afterwards in the invalidations table and this is what I got as output:

Model::updateArchiveAsInvalidated() is what adds the invalidations. If the message is in the output, then ArchiveInvalidator is getting invoked, but nothing's getting inserted for some reason. (Assuming you're stopping core:archive immediately after invalidation.) I can try to reproduce this later and see what's going on.

@tsteur commented on May 27th 2020 Member

Be good if you could have a look. I recommend above mentioned ini settings as then we'd expect today, this week, this month, this year to be always invalidated

@diosmosis commented on May 28th 2020 Member

@tsteur not able to reproduce it. If I exit after Done invalidating, I see archive invalidations for today & yesterday (including some duplicates, wonder if I can get rid of those efficiently).

@tsteur commented on May 28th 2020 Member

Don't know why (eg whether it is maybe a timezone issue) but now seeing these invalidations when exiting after the invalidations:

image

However, I do then a full archive using ./console core:archive where all sites are being archived. Then I do another tracking request again just to make sure there is data.

Then I trigger core:archive again and I do not see the invalidations again. Because I set the TTL to 1 second, I would have expected these to be invalidated again.

Can you maybe try to reproduce it that way?

@diosmosis commented on May 28th 2020 Member

I see, I'll try w/ a 1s ttl.

@diosmosis commented on May 29th 2020 Member

@tsteur can't reproduce still. Can you run the command w/ -vvv?

@tsteur commented on June 1st 2020 Member

@diosmosis it thinks there is no new visit even though there is... will do a quick debug

DEBUG [2020-06-01 21:09:56] 56610   UserSynchronizer::makeConfigured(): LDAP access synchronization not enabled.
DEBUG [2020-06-01 21:09:56] 56610   UserSynchronizer::makeConfigured: configuring with defaultSitesWithViewAccess = 
DEBUG [2020-06-01 21:09:56] 56610   UserSynchronizer::makeConfigured(): LDAP access synchronization not enabled.
DEBUG [2020-06-01 21:09:56] 56610   UserSynchronizer::makeConfigured: configuring with defaultSitesWithViewAccess = 
INFO [2020-06-01 21:09:56] 56610   ---------------------------
INFO [2020-06-01 21:09:56] 56610   INIT
INFO [2020-06-01 21:09:56] 56610   Running Matomo 4.0.0-b2 as Super User
INFO [2020-06-01 21:09:56] 56610   ---------------------------
INFO [2020-06-01 21:09:56] 56610   NOTES
INFO [2020-06-01 21:09:56] 56610   - If you execute this script at least once per hour (or more often) in a crontab, you may disable 'Browser trigger archiving' in Matomo UI > Settings > General Settings.
INFO [2020-06-01 21:09:56] 56610     See the doc at: https://matomo.org/docs/setup-auto-archiving/
INFO [2020-06-01 21:09:57] 56610   - Async process archiving supported, using CliMulti.
INFO [2020-06-01 21:09:57] 56610   - Reports for today will be processed at most every 1 seconds. You can change this value in Matomo UI > Settings > General Settings.
INFO [2020-06-01 21:09:57] 56610   - Archiving was last executed without error 1 min 0s ago
INFO [2020-06-01 21:09:57] 56610   ---------------------------
INFO [2020-06-01 21:09:57] 56610   START
INFO [2020-06-01 21:09:57] 56610   Starting Matomo reports archiving...
INFO [2020-06-01 21:09:57] 56610   Checking for queued invalidations...
INFO [2020-06-01 21:09:57] 56610     Will invalidate archived reports for today in site ID = 1's timezone (2020-06-01 00:00:00).
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Yesterday archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Yesterday archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Yesterday archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Yesterday archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Yesterday archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Yesterday archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Yesterday archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610   Earliest created time of segment 'visitIp>=15.15.15.15;visitIp<=20.20.20.20,visitIp==21.21.21.21,visitIp==22.22.22.22,actionUrl=<a class='mention' href='https://github.com/foo'>@foo</a>' w/ idSite = 1 is found to be 2019-02-04. Latest edit time is found to be 2019-02-04.
DEBUG [2020-06-01 21:09:57] 56610   Earliest created time of segment 'actions>=2' w/ idSite = 7 is found to be 2020-03-30. Latest edit time is found to be 2020-03-30.
INFO [2020-06-01 21:09:57] 56610   Done invalidating
@tsteur commented on June 1st 2020 Member

Actually it works now also for me. And it correctly detected it had visits for idSite=1 it was just not mentioned because the output starts with idSite1 and I thought all messages below belonged to idSite 1 but they were for other idSites (might need to add this to the debug messages)

INFO [2020-06-01 21:09:57] 56610     Will invalidate archived reports for today in site ID = 1's timezone (2020-06-01 00:00:00).
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...

So seems to work 👍

@tsteur commented on June 1st 2020 Member

fyi i think what happened was that something removed time_before_today_archive_considered_outdated = 1 from my config file and it was probably using a different timeout ... sorry about that

@diosmosis commented on June 2nd 2020 Member

👍 the output could probably be improved. I'm currently working on this in another PR.

@tsteur commented on June 2nd 2020 Member

👍 I guess we can merge this then?

This Pull Request was closed on June 2nd 2020
Powered by GitHub Issue Mirror