@diosmosis opened this Pull Request on December 1st 2020 Member

Description:

Previously, segments were archived procedurally after the other archives. Now they need to be invalidated so that the QueueConsumer picks them up.

Fixes #16842

Review

  • [ ] Functional review done
  • [ ] Usability review done (is anything maybe unclear or think about anything that would cause people to reach out to support)
  • [ ] Security review done see checklist
  • [ ] Code review done
  • [ ] Tests were added if useful/possible
  • [ ] Reviewed for breaking changes
  • [ ] Developer changelog updated if needed
  • [ ] Documentation added if needed
  • [ ] Existing documentation updated if needed
@tsteur commented on December 2nd 2020 Member

@diosmosis just tested it. I think there is one issue.

Say the archive TTL for today is set to, for example, 900s. If an archive for All Visits was found within the last 900s, then by the looks of it the segment archiving won't start. It should respect the 900s TTL for every individual archive.

Not sure if it's clear what I mean? I haven't checked, but I think you can reproduce this with e.g. a TTL of 900s.

First call ./console core:archive --force-idsites=1 --skip-all-segments. It should archive the "no segment" archive.

If you then call ./console core:archive --force-idsites=1, it does not seem to trigger the segments. At least, something like this is what I experienced.
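
To make the expected behaviour concrete, here is a minimal Python sketch of a per-archive TTL check. It is not Matomo's implementation (which is PHP); the function name `segments_needing_archive` and the `last_archived` mapping are hypothetical and only illustrate that each archive should be compared against the TTL on its own.

```python
from datetime import datetime, timedelta


def segments_needing_archive(last_archived, ttl_seconds, now):
    """Return the segments whose newest archive is missing or older than the TTL.

    last_archived maps a segment definition ("" means All Visits) to the time
    its latest archive was written, or None if it was never archived.
    (Hypothetical data structure, for illustration only.)
    """
    ttl = timedelta(seconds=ttl_seconds)
    stale = []
    for segment, archived_at in last_archived.items():
        # Each archive is judged on its own age: a fresh "All Visits"
        # archive must not make the segment archives look fresh too.
        if archived_at is None or now - archived_at > ttl:
            stale.append(segment)
    return stale


now = datetime(2020, 12, 2, 21, 18, 16)
last_archived = {
    "": now - timedelta(seconds=60),  # All Visits, archived a minute ago
    "userId<1": None,                 # skipped earlier via --skip-all-segments
}
print(segments_needing_archive(last_archived, 900, now))  # ['userId<1']
```

With a 900s TTL, the All Visits archive is still usable, but the segment has no archive at all and so should still be picked up.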

@tsteur commented on December 2nd 2020 Member

@diosmosis I just tested it, but for some reason it's now not even triggering the main archive, even though the last archive is older than the TTL of 30s and there is a recent tracking request:

$ ./console core:archive --force-idsites=1 --skip-all-segments -vvv
INFO [2020-12-02 21:18:15] 88332  ---------------------------
INFO [2020-12-02 21:18:15] 88332  INIT
INFO [2020-12-02 21:18:15] 88332  Running Matomo 4.0.3 as Super User
INFO [2020-12-02 21:18:15] 88332  ---------------------------
INFO [2020-12-02 21:18:15] 88332  NOTES
INFO [2020-12-02 21:18:16] 88332  - Async process archiving supported, using CliMulti.
INFO [2020-12-02 21:18:16] 88332  - Reports for today will be processed at most every 30 seconds. You can change this value in Matomo UI > Settings > General Settings.
INFO [2020-12-02 21:18:16] 88332  - Archiving was last executed without error 1 min 18s ago
INFO [2020-12-02 21:18:16] 88332  - The following websites do not use the tracker: 
INFO [2020-12-02 21:18:16] 88332  - Will process 1 websites (--force-idsites)
INFO [2020-12-02 21:18:16] 88332  - Will process specified sites: 1
INFO [2020-12-02 21:18:16] 88332  ---------------------------
INFO [2020-12-02 21:18:16] 88332  START
INFO [2020-12-02 21:18:16] 88332  Starting Matomo reports archiving...
DEBUG [2020-12-02 21:18:16] 88332  Applying queued rearchiving...
INFO [2020-12-02 21:18:16] 88332  Start processing archives for site 1.
DEBUG [2020-12-02 21:18:16] 88332  Checking for queued invalidations...
INFO [2020-12-02 21:18:16] 88332    Will invalidate archived reports for today in site ID = 1's timezone (2020-12-02 00:00:00).
DEBUG [2020-12-02 21:18:16] 88332    Found usable archive for [idSite = 1, period = day 2020-12-02,2020-12-02, segment = ], skipping invalidation.
DEBUG [2020-12-02 21:18:16] 88332  General tracker cache was re-created.
DEBUG [2020-12-02 21:18:16] 88332    Found usable archive for [idSite = 1, period = day 2020-12-02,2020-12-02, segment = visitIp>=15.15.15.15;visitIp<=20.20.20.20,visitIp==21.21.21.21,visitIp==22.22.22.22,actionUrl=@foo], skipping invalidation.
DEBUG [2020-12-02 21:18:16] 88332    Found usable archive for [idSite = 1, period = day 2020-12-02,2020-12-02, segment = userId<1], skipping invalidation.
DEBUG [2020-12-02 21:18:16] 88332    Found usable archive for [idSite = 1, period = day 2020-12-02,2020-12-02, segment = userId<1], skipping invalidation.
DEBUG [2020-12-02 21:18:16] 88332    Found usable archive for [idSite = 1, period = day 2020-12-02,2020-12-02, segment = userId<1], skipping invalidation.
DEBUG [2020-12-02 21:18:16] 88332    Found usable archive for [idSite = 1, period = day 2020-12-02,2020-12-02, segment = userId<1], skipping invalidation.
DEBUG [2020-12-02 21:18:16] 88332    Found usable archive for [idSite = 1, period = day 2020-12-02,2020-12-02, segment = userId<1], skipping invalidation.
DEBUG [2020-12-02 21:18:16] 88332    Yesterday archive can be skipped due to no visits for idSite = 1, skipping invalidation...
DEBUG [2020-12-02 21:18:16] 88332  Earliest created time of segment 'visitIp>=15.15.15.15;visitIp<=20.20.20.20,visitIp==21.21.21.21,visitIp==22.22.22.22,actionUrl=@foo' w/ idSite = 1 is found to be 2019-02-04. Latest edit time is found to be 2019-02-04.
DEBUG [2020-12-02 21:18:16] 88332  Earliest created time of segment 'userId<1' w/ idSite = 1 is found to be 2020-09-23. Latest edit time is found to be 2020-09-23.
DEBUG [2020-12-02 21:18:16] 88332  Earliest created time of segment 'userId<1' w/ idSite = 1 is found to be 2020-09-23. Latest edit time is found to be 2020-09-23.
DEBUG [2020-12-02 21:18:16] 88332  Earliest created time of segment 'userId<1' w/ idSite = 1 is found to be 2020-09-23. Latest edit time is found to be 2020-09-23.
DEBUG [2020-12-02 21:18:16] 88332  Earliest created time of segment 'userId<1' w/ idSite = 1 is found to be 2020-09-23. Latest edit time is found to be 2020-09-23.
DEBUG [2020-12-02 21:18:16] 88332  Earliest created time of segment 'userId<1' w/ idSite = 1 is found to be 2020-09-23. Latest edit time is found to be 2020-09-23.
DEBUG [2020-12-02 21:18:16] 88332  Done invalidating
DEBUG [2020-12-02 21:18:16] 88332  No next invalidated archive.
INFO [2020-12-02 21:18:16] 88332  Finished archiving for site 1, 0 API requests, Time elapsed: 0.302s [1 / 1 done]
DEBUG [2020-12-02 21:18:16] 88332  No more sites left to archive, stopping.
INFO [2020-12-02 21:18:16] 88332  Done archiving!
INFO [2020-12-02 21:18:16] 88332  ---------------------------
INFO [2020-12-02 21:18:16] 88332  SUMMARY
INFO [2020-12-02 21:18:16] 88332  Processed 0 archives.
INFO [2020-12-02 21:18:16] 88332  Total API requests: 0
INFO [2020-12-02 21:18:16] 88332  done: 0 req, 479 ms, no error
INFO [2020-12-02 21:18:16] 88332  Time elapsed: 0.479s

It might be unrelated to this issue though? The site's timezone is "auckland" and there is definitely a very recent visit

@tsteur commented on December 2nd 2020 Member

I'll try to have a quick debug.

I would have actually expected to have an entry for 2020-12-03 and not

DEBUG [2020-12-02 21:18:16] 88332 Found usable archive for [idSite = 1, period = day 2020-12-02,2020-12-02, segment = ], skipping invalidation.

@tsteur commented on December 2nd 2020 Member

Invalidation table itself is empty

@tsteur commented on December 2nd 2020 Member

@diosmosis I think the method invalidateRecentDate is basically not applying the site's timezone, and neither is the period factory in invalidateWithSegments?

@tsteur commented on December 2nd 2020 Member

I'm not sure what's wrong but for today it should have tried 2020-12-03 and not 2020-12-02.
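
The date mismatch can be reproduced with a small Python sketch. It assumes the log timestamps are in UTC (the thread does not say so) and uses a fixed UTC+13 offset for Auckland, which is what applies in December under daylight saving time. The helper `today_for_site` is hypothetical and not part of Matomo.

```python
from datetime import datetime, timedelta, timezone


def today_for_site(now_utc, site_tz):
    """Return the calendar date that counts as "today" in the site's timezone."""
    return now_utc.astimezone(site_tz).date()


# Assumption: the log timestamp is UTC. Auckland is at UTC+13 in December;
# real code would use a named zone instead of a fixed offset.
auckland_december = timezone(timedelta(hours=13))
now_utc = datetime(2020, 12, 2, 21, 18, 16, tzinfo=timezone.utc)

print(now_utc.date())                              # 2020-12-02
print(today_for_site(now_utc, auckland_december))  # 2020-12-03
```

Under these assumptions, an invalidation that uses the server date targets 2020-12-02, while the site's "today" is already 2020-12-03.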

@diosmosis commented on December 3rd 2020 Member

@tsteur updated for timezone issue

@tsteur commented on December 3rd 2020 Member

Works now @diosmosis Thanks! 👍 I was wondering if we could write a test for this recent change https://github.com/matomo-org/matomo/pull/16845/commits/52e4ea3575fa1ce7af82725edfd4fec561c898bc#diff-f41c275e883996b8b69cb765b1818e12e82cbe034e648bc3f7d03610cc7abad1R843 to check that it uses the right date, but I suppose it might be a bit difficult. It might work with a mock that checks what is passed to the invalidateWithSegments method. Feel free to merge otherwise.
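
The mock-based idea could look roughly like the following Python sketch. The `Invalidator` class is a hypothetical stand-in, not the actual Matomo PHP code; only the method names echo invalidateRecentDate and invalidateWithSegments from the discussion, and their signatures here are assumptions.

```python
from datetime import date, datetime, timedelta, timezone
from unittest import mock


class Invalidator:
    """Hypothetical stand-in for the class under test (not Matomo code)."""

    def __init__(self, site_tz):
        self.site_tz = site_tz

    def invalidate_with_segments(self, id_site, day):
        raise NotImplementedError("replaced by a mock in the test")

    def invalidate_recent_date(self, id_site, now_utc):
        # Convert to the site's timezone before deciding what "today" is.
        site_today = now_utc.astimezone(self.site_tz).date()
        self.invalidate_with_segments(id_site, site_today)


def test_uses_site_local_date():
    invalidator = Invalidator(timezone(timedelta(hours=13)))  # Auckland in December
    now_utc = datetime(2020, 12, 2, 21, 18, 16, tzinfo=timezone.utc)
    # Replace the downstream method with a mock and record what it receives.
    with mock.patch.object(invalidator, "invalidate_with_segments") as spy:
        invalidator.invalidate_recent_date(1, now_utc)
    spy.assert_called_once_with(1, date(2020, 12, 3))


test_uses_site_local_date()
```

The test fails if the date is computed without the site's timezone, because the mock would then receive 2020-12-02.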

@diosmosis commented on December 3rd 2020 Member

@tsteur added a test and fixed existing tests

@tsteur commented on December 3rd 2020 Member

@diosmosis LGTM. There still seems to be a failing test though?

This Pull Request was closed on December 3rd 2020