@mgazdzik opened this Issue on November 11th 2014 Contributor

The current archiving flow can cause a number of problems when archiving segments on instances that are 2, 3 or 4 years old. During the normal flow of cron archiving, only the last 2 years are ever processed. Adding new segment(s) can cause the following two problems for the archive process:

  • If archiving ever falls back to computing last3, last4 (anything bigger than last2) for the year period, it can trigger processing of the days and months of that 3rd year. This causes a huge increase in archiving time. In addition, the more segments exist on the instance, the more additional computation is needed to complete the last3 archiving.
  • On high-traffic instances, adding new segments can also be troublesome, because Piwik will try to process the last 2 days, weeks, months and years. Given a batch of 50 segments, such 'catching up' takes significantly more time. A workaround is to add only a couple of segments at a time, but this is hard to coordinate when there are many Piwik admins.

The goal of this ticket is to decide on the best approach to this issue and, hopefully, to plan the implementation of an improvement.

@mgazdzik commented on November 11th 2014 Contributor

One idea to handle those issues is:

  • Define a single unit of processing for new segments (e.g. one month back, including its days and weeks, repeated all the way back to the website's ts_created). With that in place, we can set a config variable to process only a certain number of units per segment in a single archiving run. That way we can limit the amount of computation in a single archive run, split the whole process into several lighter runs, and keep providing current statistics at the same time.
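
To make the idea concrete, here is a rough Python sketch of the bookkeeping (not Piwik code). It assumes a unit is one calendar month and that units are processed newest first. The names `month_units`, `plan_run` and `max_units_per_run` are hypothetical and stand in for whatever the real implementation and config variable would be called.

```python
from datetime import date


def month_units(created, today):
    """List (year, month) units from the current month back to the month
    in which the website was created (its ts_created), newest first."""
    units = []
    year, month = today.year, today.month
    while (year, month) >= (created.year, created.month):
        units.append((year, month))
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    return units


def plan_run(pending, max_units_per_run):
    """Pick at most `max_units_per_run` units per segment for one archiving
    run and remove them from `pending` (hypothetical config variable)."""
    batch = {}
    for segment, units in pending.items():
        batch[segment] = units[:max_units_per_run]
        pending[segment] = units[max_units_per_run:]
    return batch


# Illustrative segment definition and dates only.
pending = {"browserCode==FF": month_units(date(2014, 8, 15), date(2014, 11, 11))}
first_run = plan_run(pending, max_units_per_run=2)
```

With these illustrative dates the segment has four monthly units to catch up on. The first run takes the two most recent ones and leaves the other two for a later run, so each cron run stays bounded regardless of how old the website is.
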
@mattab commented on November 15th 2014 Member

Thanks for the suggestion! We'll investigate this in the 2.11.0 sprint, once we have #5363

@RMastop commented on December 1st 2014 Contributor

A segment does not necessarily need to be archived since the beginning of time. I would suggest adding an extra set of properties: a start-date and an end-date for the segment. This way the archiving would not need to calculate all historic data.
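
As a rough illustration of what I mean (a Python sketch, not Piwik code): the `Segment` class, its `start_date` / `end_date` fields and the `archive_range` helper are all hypothetical names. I am assuming a missing date means "no limit" on that side.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass
class Segment:
    definition: str
    start_date: Optional[date] = None  # hypothetical: None means no lower limit
    end_date: Optional[date] = None    # hypothetical: None means no upper limit


def archive_range(segment: Segment, site_created: date,
                  today: date) -> Optional[Tuple[date, date]]:
    """Clamp the range to archive to the segment's own start/end dates.
    Returns None when there is nothing to archive for this segment."""
    start = max(segment.start_date or site_created, site_created)
    end = min(segment.end_date or today, today)
    return (start, end) if start <= end else None


# Illustrative values only: a site created in 2011, a segment that only
# matters from June 2014 onwards.
campaign = Segment("browserCode==FF", start_date=date(2014, 6, 1))
rng = archive_range(campaign, site_created=date(2011, 1, 1), today=date(2014, 12, 1))
```

Here the archiver would only need to cover June 2014 to December 2014 for this segment, instead of everything since 2011.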

@mattab commented on March 9th 2015 Member

First step to solve this issue will be #7223

If that issue does not solve the problem altogether, let's discuss again what solution we could put in place :+1:

This Issue was closed on March 9th 2015