@mattab opened this Issue on February 27th 2014 Member

The goal of this ticket is to discuss how we could improve the speed and efficiency of the custom date range report aggregation. Currently, archiving custom date ranges is slow.

@mattab commented on September 14th 2014 Member

Example from today's email to feedback@piwik.org: on the All Websites dashboard, when the default URL is module=MultiSites&action=index&idSite=2&period=range&date=last30, the user experiences slowness because the data is not pre-processed and will be processed on demand. The user said it takes forever to load with only 10 websites, and that it is almost faster to open each one... and this is one of many.

Update Dec 2014: this was fixed in #6672

@mattab commented on March 12th 2015 Member

@tsteur this request for example is slow: http://demo.piwik.org/index.php?module=CoreHome&action=index&date=2013-03-05,2015-03-11&period=range&idSite=1#/module=Actions&action=menuGetPageUrls&date=2013-03-05,2015-03-11&period=range&idSite=1 - it took about 40 seconds to archive. Maybe we could make this kind of large date range much faster?

@tsteur commented on March 12th 2015 Member

This request takes 2.3 seconds when I request it... and it should be even faster once #7409 is merged

@tsteur commented on March 16th 2015 Member

One thing I noticed, and that took me a while to figure out: if someone actually uses range dates, they should disable browser archiving. Otherwise the last day, week, month or year will always be re-archived, depending on the range. We might have to do this automatically (disable browser archiving for some subperiods if a range is used and an archive already exists).

Note: Once we pre-archive range dates, this can become a problem, as it would always pre-archive the last year / month / week / day because it will always be authorized to archive.

@tsteur commented on March 19th 2015 Member

A lot of improvements were made here. We have to decide next week how we want to continue with this problem. It might make sense to make further improvements when working on #7470 (refactoring the Archiver). One idea was, for example, to build the range archive for only the requested record. This is not easy to add to the current implementation of the archiver but would bring quite a bit of improvement.

Another idea could be to sometimes subtract range dates. E.g. if today is 2015-03-19 and one fetches 2014-12-20,yesterday (yesterday will very often be the end date), we could fetch the year 2015 and subtract the 2015-03-19 archive. The same applies if we have 10 months: instead of fetching 10 monthly archives, we could fetch 1 yearly archive and subtract 2 monthly archives. This is quite hard to implement though.
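The 10-month case above can be sketched in Python. This is only an illustration, not the Piwik archiver: `plan_months` and `aggregate` are hypothetical names, the range is assumed to cover whole months of a single year, and the metric is assumed to be additive (a plain visit count, for example). Non-additive metrics such as unique visitors cannot be recovered by subtraction.

```python
from typing import Dict, List, Tuple


def plan_months(first_month: int, last_month: int) -> Tuple[str, List[int], List[int]]:
    """Pick the cheaper plan for a range of whole months within one year.

    Returns (strategy, months_to_add, months_to_subtract).
    Hypothetical helper for illustration only.
    """
    wanted = list(range(first_month, last_month + 1))
    missing = [m for m in range(1, 13) if m not in wanted]
    # Plan A reads len(wanted) monthly archives.
    # Plan B reads 1 yearly archive plus len(missing) monthly archives.
    if 1 + len(missing) < len(wanted):
        return ("year_minus_months", [], missing)
    return ("sum_months", wanted, [])


def aggregate(year_total: int, monthly: Dict[int, int],
              first_month: int, last_month: int) -> int:
    """Compute an additive metric for the range using the cheaper plan."""
    strategy, add, subtract = plan_months(first_month, last_month)
    if strategy == "year_minus_months":
        return year_total - sum(monthly[m] for m in subtract)
    return sum(monthly[m] for m in add)


# Made-up numbers: month m has m * 100 visits.
monthly_visits = {m: m * 100 for m in range(1, 13)}
year_visits = sum(monthly_visits.values())

# January to October: 1 yearly archive minus November and December.
jan_to_oct = aggregate(year_visits, monthly_visits, 1, 10)
```

With this plan, January to October reads 3 archives instead of 10. A real implementation would also have to handle partial months, ranges spanning several years, and whole report tables rather than single numbers, which is where most of the difficulty mentioned above comes from.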

The easiest solution to implement, and probably the fastest as well, would be to fetch only the requested recordName and only the requested 1st level table or the requested subtable. I tested it and it is very fast and easy to implement. The problem is that it does not work with subtableIds. We'd have to use labels to identify subtables, as we can generate the subtableIds only if we build the expanded table, and building the expanded table is expensive again (in terms of time needed).
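A rough Python sketch of the "only the requested record" idea follows. The data layout is an assumption made for illustration: each sub-period archive is modelled as a dict mapping a record name to a first-level table, and each table maps a row label to an additive count. `aggregate_record` and the record names used are hypothetical, and rows are matched by label because, as noted above, subtableIds are not available without the expanded table.

```python
from collections import Counter
from typing import Dict, Iterable

Table = Dict[str, int]        # row label -> additive metric value
Archive = Dict[str, Table]    # record name -> first-level table


def aggregate_record(archives: Iterable[Archive], record_name: str) -> Table:
    """Sum one record across sub-period archives, matching rows by label.

    All other records in each archive are ignored, so the cost scales
    with the size of the requested record only.
    """
    total: Counter = Counter()
    for archive in archives:
        # Skip archives that do not contain the requested record.
        total.update(archive.get(record_name, {}))
    return dict(total)


# Made-up sub-period archives, each holding two records.
day_archives = [
    {"Actions_actions_url": {"/home": 10, "/blog": 4}, "Referrers_type": {"search": 7}},
    {"Actions_actions_url": {"/home": 6, "/docs": 2}, "Referrers_type": {"direct": 3}},
]

page_urls = aggregate_record(day_archives, "Actions_actions_url")
```

Here only the page URL record is read and merged; the referrer record in each archive is never touched.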

@mattab commented on March 30th 2015 Member

Created follow-up issues (in order of importance):

  • When requesting Date Range or Custom Segment, only archive the requested record #7573
  • Date Range archiving: Only archive report for the requested idSubtable #7575
  • Make Date Range use the optimal number of periods by substracting periods #7574
This Issue was closed on March 19th 2015