Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lastRunKey in Archiver is not reliable #15127

Closed
tsteur opened this issue Nov 6, 2019 · 2 comments
Closed

lastRunKey in Archiver is not reliable #15127

tsteur opened this issue Nov 6, 2019 · 2 comments
Labels
invalid For issues or pull requests that are no longer relevant to Matomo core. Task Indicates an issue is neither a feature nor a bug and it's purely a "technical" change.
Milestone

Comments

@tsteur
Copy link
Member

tsteur commented Nov 6, 2019

We are using the lastRunKey https://github.com/matomo-org/matomo/blob/3.12.0/core/CronArchive.php#L280-L284 to determine when an archiver was last run for day or period.

We basically:

  1. $success = trigger archiving for day/periods
  2. if ($success) { $lastRunKey=time(); }

On the next archiver run we create the date parameter when archiving based on lastRunKey=time(). However, the problem is that in step 1 potentially not all periods were actually archived. Many could have been skipped because we are for example already archiving a different period (when we want to archive for example month, but a week period is currently being processed, than we skip month but AFAIK still return $success = true).

This could lead to randomly some periods being skipped in some edge conditions. Like it might use &period=day&date=last2 when it was actually last finished correctly only 3 days ago and then maybe not fully process 3 days ago.

I haven't actually debugged this, but seeing the code there I very much would expect this behaviour and this was introduced by recent changes we added this/last year where we started skipping some periods for example if another period is in progress. We might skip entire segments etc.

From what I see we already return $success= false here https://github.com/matomo-org/matomo/blob/3.12.0/core/CronArchive.php#L1006-L1008 which should prevent the lastRunKey from being written, but this seems not yet implemented for long running segment archives when dynamically aborting a request shortly before it is supposed to be executed in https://github.com/matomo-org/matomo/blob/3.12.0/core/CronArchive.php#L1013-L1017

This code is triggered when

  • Say you have 20 segments
  • Each segment takes 1 hour to archive
  • We archive max 1 segment at once for example (or 2 or 3 ...)
  • This means we execute segment 1
  • 1 hour later we want to launch the next segment archive
  • We need to check if some other period is in progress/segment and shortly before we launch that segment abort that request... currently we would still return $success=true AFAIK when it should be $success=false

A possible problem is that the more segments you have for example, the more likely it is that at some point a segment or period will be skipped, therefore $success = false and therefore on every archive run we do not update the lastRunKey leading eventually to things like last100 or last200 making archiving very slow.

I reckon best be to remove all logic around lastRunKey ideally but not sure what better logic to use and what is feasible.

This is also related to https://github.com/matomo-org/matomo/pull/14937/files#r343304194 and #15117

instead of detecting one lastProcessedTime for all periods and dates, ideally we would detect the lastProcessedTime on demand shortly before starting the archiving of a period/segment... by default we would use last3 or so (or whatever the default is) and if any periods were invalidated since then, we would need to adjust behaviour and use better date or launch multiple archive queries. Instead of last52 we could eg use last3 and then another query where we specifically archive the invalidated date.

I know I'm not explaining it well, it's a difficult topic. Happy to provide more insights.

@tsteur
Copy link
Member Author

tsteur commented May 27, 2020

@diosmosis if I see this right we're not even using the last run key anymore and with the new logic it's not needed anymore and can close this ticket?

@diosmosis
Copy link
Member

Yes, will close

@mattab mattab added the invalid For issues or pull requests that are no longer relevant to Matomo core. label Sep 28, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
invalid For issues or pull requests that are no longer relevant to Matomo core. Task Indicates an issue is neither a feature nor a bug and it's purely a "technical" change.
Projects
None yet
Development

No branches or pull requests

4 participants