@tsteur opened this Pull Request on March 16th 2020 Member

Fixes a problem with roll up reporting when enable_processing_unique_visitors_multiple_sites = 1.

The problem is assuming idSite is a roll up (say idSite = 12 which has as children sites the idsites 2 and 4), and period is say week (same applies for month, year, range):

  • new LogAggregator(new Parameters($idSite, $period)).
  • LogAggregator::__construct
    • $this->idSites = Piwik::postEvent('ArchiveProcessor.Parameters.getIdSites', $idSite, $period). // meaning $this->idSites = array(2,4)
  • Compute Nb Metrics by calling
    • $logAggregator->queryVisitsByDimension(array(2,4)); // this is correct and wanted
  • Then we do the regular plugins archiving which also calls many times
    • $archiver->aggregateMultipleReports()
      • $logAggregator->getIdSites(); // so it basically aggregates the custom reports by aggregating the custom report from idSite 2 and 4 but there wouldn't be an archive for the custom report in the roll up and therefore week, month, year,range be always no data.

The problem is basically that when we archive multiple reports for a roll up idSite 12, then we want to aggregate all archives of idSite = 12 for each day within the week that is being processed. In the current buggy implementation it would have aggregated the archives of all children sites 2 and 4. This is problematic when thinking of custom reports. Where the roll up might have a custom report configured, but the children idSites 2 and 4 don't have a custom report configured. This means it can't find an archive for idSite 2 and 4 and would therefore create an empty archive even though each day in idSite 12 has an archive. This means, aggregateMultipleReports always needs to only have the rollUpReportingIdSite.

On the contrary, when processing unique metrics, then we actually want to operate on all children sites (2 and 4) and include them in the query for the unique visitors and users metric. This is totally opposite behaviour of what we need when aggregating archives.

However, both use the same LogAggregator class and the same list of $parameters->getIdSites() which is determined by Piwik::postEvent('ArchiveProcessor.Parameters.getIdSites', $idSite, $period).. When that event is triggered and roll up reporting decides whether to look at the roll up site, or the children sites, it cannot know behave differently for computeNbMetrics and for aggregateMultipleReports but it is needed to make things work.

Therefore I had to change logic a bit. Also in Roll Up Reporting itself (see PR there).

Basically, we need two different events to ensure correct behaviour.

@tsteur commented on March 17th 2020 Member

Tested it locally a few times with roll up reporting and worked for me.

@diosmosis commented on March 19th 2020 Member

Code looks good, though I would have expected there to be some change to the expected test files? Maybe this logic isn't completely covered?

@tsteur commented on March 19th 2020 Member

The logic is all in Roll Up Reporting and not in core. I added some tests there. This logic pretty much does not affect core itself.

@diosmosis commented on March 19th 2020 Member

I see, I thought the tests were meant to cover this code since there was an event handler added for ArchiveProcessor.ComputeNbUniques.getIdSites to a test.

@tsteur commented on March 19th 2020 Member

true there is some test I forgot about it. The test failed initially but after adding the same sites in that event it fixed the tests showing that it works. That's why there is no change in the expected because just like before it returns the same idSites. Before it got these sites from the ArchiveProcessor.Parameters.getIdSites event

image

This Pull Request was closed on March 24th 2020
Powered by GitHub Issue Mirror