Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix today archive is not being archived #15996

Merged
merged 2 commits into from Jun 2, 2020
Merged

Fix today archive is not being archived #15996

merged 2 commits into from Jun 2, 2020

Conversation

tsteur
Copy link
Member

@tsteur tsteur commented May 27, 2020

@diosmosis Noticed an error that today wasn't being archived while testing again #15991 . Eg I had set config to:

enable_general_settings_admin=0
time_before_today_archive_considered_outdated = 1

Timezone of the site: America/Los_Angeles

I would have expected that the archive is being invalidated every time I execute the console archive command but this wasn't happening. Then noticed $minDatetimeArchiveProcessedUTC was already a date and therefore somehow converted again resulting in a different time.

See screenshot where the archive should have been ignored because ts_archived is older but it wasn't ignored.
image

See what happened to the date after passing it again to date::factory:
image

Looks actually a bit like a bug to me that Date::factory would set time to 00:00:00.

Another potential bug maybe @diosmosis ... the last line in that method returns return [$idArchive, $visits, $visitsConverted, true, $tsArchived];. Should the ignore parameter really be set to true?

Noticed an error that today wasn't being archived. Eg I had set config to:

```ini
enable_general_settings_admin=0
time_before_today_archive_considered_outdated = 1
```

I would have expected that the archive is being invalidated every time I execute the console archive command but this wasn't happening. Then noticed `$minDatetimeArchiveProcessedUTC` was already a date and therefore somehow converted again resulting in a different time.
@tsteur tsteur added not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org. Needs Review PRs that need a code review labels May 27, 2020
@tsteur tsteur added this to the 4.0.0 milestone May 27, 2020
@tsteur
Copy link
Member Author

tsteur commented May 27, 2020

@diosmosis I think there's also still another bug. Maybe around https://github.com/matomo-org/matomo/blob/todayarchive/core/CronArchive.php#L876-L880

When I set the TTL for archives to 1 second. Then I would expect also that this week, this month and this year is being archived every time I execute core:archive but it looks like it doesn't do this. Eg here it only launches day https://github.com/matomo-org/matomo/blob/todayarchive/core/CronArchive.php#L960

@@ -92,10 +92,14 @@ public static function getArchiveIdAndVisits(ArchiveProcessor\Parameters $params
return [false, $visits, $visitsConverted, true, $tsArchived];
}

if (!empty($minDatetimeArchiveProcessedUTC) && (!is_object($minDatetimeArchiveProcessedUTC) || !$minDatetimeArchiveProcessedUTC instanceof Date)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the var isn't an object then it will never be a Date, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could remove it. Guess in case it was some different class but that wouldn't work anyway

@diosmosis
Copy link
Member

Another potential bug maybe @diosmosis ... the last line in that method returns return [$idArchive, $visits, $visitsConverted, true, $tsArchived];. Should the ignore parameter really be set to true?

I'm not sure what you mean by "ignore parameter", that value is whether we found any archive at all for the period/site/segment/plugin. It's used in Loader.php to avoid invalidating if there's no archive at all.

@diosmosis
Copy link
Member

When I set the TTL for archives to 1 second. Then I would expect also that this week, this month and this year is being archived every time I execute core:archive but it looks like it doesn't do this. Eg here it only launches day https://github.com/matomo-org/matomo/blob/todayarchive/core/CronArchive.php#L960

This would mean that the archive invalidation isn't invalidating week/month/year as well. Are you able to check if after that code is run, there are entries for those in archive_invalidations? And that the archives are set to DONE_INVALIDATED? If there are, then there's an issue in core:archive selecting the wrong invalidations.

@tsteur
Copy link
Member Author

tsteur commented May 27, 2020

I'm not sure what you mean by "ignore parameter", that value is whether we found any archive at all for the period/site/segment/plugin. It's used in Loader.php to avoid invalidating if there's no archive at all.

Ignore this. I was reading it wrong.

There is no entry afterwards in the invalidations table and this is what I got as output:

INFO [2020-05-27 21:25:18] 75393  START
INFO [2020-05-27 21:25:18] 75393  Starting Matomo reports archiving...
INFO [2020-05-27 21:25:18] 75393  Checking for queued invalidations...
INFO [2020-05-27 21:25:18] 75393    Will invalidate archived reports for today in site ID = 1's timezone (2020-05-27 00:00:00).
INFO [2020-05-27 21:25:18] 75393    Will invalidate archived reports for yesterday in site ID = 4's timezone (2020-05-27 00:00:00).
INFO [2020-05-27 21:25:18] 75393  Done invalidating

@diosmosis
Copy link
Member

There is no entry afterwards in the invalidations table and this is what I got as output:

Model::updateArchiveAsInvalidated() is what adds the invalidations. If the message is in the output, then ArchiveInvalidator is getting invoked, but nothing's getting inserted for some reason. (Assuming you're stopping core:archive immediately after invalidation.) I can try to reproduce this later and see what's going on.

@tsteur
Copy link
Member Author

tsteur commented May 27, 2020

Be good if you could have a look. I recommend above mentioned ini settings as then we'd expect today, this week, this month, this year to be always invalidated

@diosmosis
Copy link
Member

@tsteur not able to reproduce it. If I exit after Done invalidating, I see archive invalidations for today & yesterday (including some duplicates, wonder if I can get rid of those efficiently).

@tsteur
Copy link
Member Author

tsteur commented May 28, 2020

Don't know why (eg whether it is maybe a timezone issue) but now seeing these invalidations when exiting after the invalidations:

image

However, I do then a full archive using ./console core:archive where all sites are being archived. Then I do another tracking request again just to make sure there is data.

Then I trigger core:archive again and I do not see the invalidations again. Because I set the TTL to 1 second, I would have expected these to be invalidated again.

Can you maybe try to reproduce it that way?

@diosmosis
Copy link
Member

I see, I'll try w/ a 1s ttl.

@diosmosis
Copy link
Member

@tsteur can't reproduce still. Can you run the command w/ -vvv?

@tsteur
Copy link
Member Author

tsteur commented Jun 1, 2020

@diosmosis it thinks there is no new visit even though there is... will do a quick debug

DEBUG [2020-06-01 21:09:56] 56610   UserSynchronizer::makeConfigured(): LDAP access synchronization not enabled.
DEBUG [2020-06-01 21:09:56] 56610   UserSynchronizer::makeConfigured: configuring with defaultSitesWithViewAccess = 
DEBUG [2020-06-01 21:09:56] 56610   UserSynchronizer::makeConfigured(): LDAP access synchronization not enabled.
DEBUG [2020-06-01 21:09:56] 56610   UserSynchronizer::makeConfigured: configuring with defaultSitesWithViewAccess = 
INFO [2020-06-01 21:09:56] 56610   ---------------------------
INFO [2020-06-01 21:09:56] 56610   INIT
INFO [2020-06-01 21:09:56] 56610   Running Matomo 4.0.0-b2 as Super User
INFO [2020-06-01 21:09:56] 56610   ---------------------------
INFO [2020-06-01 21:09:56] 56610   NOTES
INFO [2020-06-01 21:09:56] 56610   - If you execute this script at least once per hour (or more often) in a crontab, you may disable 'Browser trigger archiving' in Matomo UI > Settings > General Settings.
INFO [2020-06-01 21:09:56] 56610     See the doc at: https://matomo.org/docs/setup-auto-archiving/
INFO [2020-06-01 21:09:57] 56610   - Async process archiving supported, using CliMulti.
INFO [2020-06-01 21:09:57] 56610   - Reports for today will be processed at most every 1 seconds. You can change this value in Matomo UI > Settings > General Settings.
INFO [2020-06-01 21:09:57] 56610   - Archiving was last executed without error 1 min 0s ago
INFO [2020-06-01 21:09:57] 56610   ---------------------------
INFO [2020-06-01 21:09:57] 56610   START
INFO [2020-06-01 21:09:57] 56610   Starting Matomo reports archiving...
INFO [2020-06-01 21:09:57] 56610   Checking for queued invalidations...
INFO [2020-06-01 21:09:57] 56610     Will invalidate archived reports for today in site ID = 1's timezone (2020-06-01 00:00:00).
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Yesterday archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Yesterday archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Yesterday archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Yesterday archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Yesterday archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Yesterday archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Yesterday archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610   Earliest created time of segment 'visitIp>=15.15.15.15;visitIp<=20.20.20.20,visitIp==21.21.21.21,visitIp==22.22.22.22,actionUrl=@foo' w/ idSite = 1 is found to be 2019-02-04. Latest edit time is found to be 2019-02-04.
DEBUG [2020-06-01 21:09:57] 56610   Earliest created time of segment 'actions>=2' w/ idSite = 7 is found to be 2020-03-30. Latest edit time is found to be 2020-03-30.
INFO [2020-06-01 21:09:57] 56610   Done invalidating

@tsteur
Copy link
Member Author

tsteur commented Jun 1, 2020

Actually it works now also for me. And it correctly detected it had visits for idSite=1 it was just not mentioned because the output starts with idSite1 and I thought all messages below belonged to idSite 1 but they were for other idSites (might need to add this to the debug messages)

INFO [2020-06-01 21:09:57] 56610     Will invalidate archived reports for today in site ID = 1's timezone (2020-06-01 00:00:00).
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...
DEBUG [2020-06-01 21:09:57] 56610     Today archive can be skipped due to no visits, skipping invalidation...

So seems to work 👍

@tsteur
Copy link
Member Author

tsteur commented Jun 1, 2020

fyi i think what happened was that something removed time_before_today_archive_considered_outdated = 1 from my config file and it was probably using a different timeout ... sorry about that

@diosmosis
Copy link
Member

👍 the output could probably be improved. I'm currently working on this in another PR.

@tsteur
Copy link
Member Author

tsteur commented Jun 2, 2020

👍 I guess we can merge this then?

@diosmosis diosmosis merged commit 536bb7f into 4.x-dev Jun 2, 2020
@diosmosis diosmosis deleted the todayarchive branch June 2, 2020 21:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Review PRs that need a code review not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants