Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better handling of processing uniques for multiple sites #15696

Merged
merged 7 commits into from Mar 24, 2020

Conversation

tsteur
Copy link
Member

@tsteur tsteur commented Mar 16, 2020

Fixes a problem with roll up reporting when enable_processing_unique_visitors_multiple_sites = 1.

The problem is assuming idSite is a roll up (say idSite = 12 which has as children sites the idsites 2 and 4), and period is say week (same applies for month, year, range):

  • new LogAggregator(new Parameters($idSite, $period)).
  • LogAggregator::__construct
    • $this->idSites = Piwik::postEvent('ArchiveProcessor.Parameters.getIdSites', $idSite, $period). // meaning $this->idSites = array(2,4)
  • Compute Nb Metrics by calling
    • $logAggregator->queryVisitsByDimension(array(2,4)); // this is correct and wanted
  • Then we do the regular plugins archiving which also calls many times
    • $archiver->aggregateMultipleReports()
      • $logAggregator->getIdSites(); // so it basically aggregates the custom reports by aggregating the custom report from idSite 2 and 4 but there wouldn't be an archive for the custom report in the roll up and therefore week, month, year,range be always no data.

The problem is basically that when we archive multiple reports for a roll up idSite 12, then we want to aggregate all archives of idSite = 12 for each day within the week that is being processed. In the current buggy implementation it would have aggregated the archives of all children sites 2 and 4. This is problematic when thinking of custom reports. Where the roll up might have a custom report configured, but the children idSites 2 and 4 don't have a custom report configured. This means it can't find an archive for idSite 2 and 4 and would therefore create an empty archive even though each day in idSite 12 has an archive. This means, aggregateMultipleReports always needs to only have the rollUpReportingIdSite.

On the contrary, when processing unique metrics, then we actually want to operate on all children sites (2 and 4) and include them in the query for the unique visitors and users metric. This is totally opposite behaviour of what we need when aggregating archives.

However, both use the same LogAggregator class and the same list of $parameters->getIdSites() which is determined by Piwik::postEvent('ArchiveProcessor.Parameters.getIdSites', $idSite, $period).. When that event is triggered and roll up reporting decides whether to look at the roll up site, or the children sites, it cannot know behave differently for computeNbMetrics and for aggregateMultipleReports but it is needed to make things work.

Therefore I had to change logic a bit. Also in Roll Up Reporting itself (see PR there).

Basically, we need two different events to ensure correct behaviour.

@tsteur tsteur added the Pull Request WIP Indicates the current pull request is still work in progress and not ready yet for a review. label Mar 16, 2020
@tsteur tsteur added this to the 3.13.5 milestone Mar 16, 2020
@tsteur tsteur added Needs Review PRs that need a code review and removed Pull Request WIP Indicates the current pull request is still work in progress and not ready yet for a review. labels Mar 17, 2020
$sites = $this->getIdSitesToComputeNbUniques();

if (count($sites) > 1 && Rules::shouldSkipUniqueVisitorsCalculationForMultipleSites()) {
if ($periodLabel != 'day') {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when period = day, we keep the aggregated unique visitor metric which is calculated by summing the unique visitor metric of each indidividual website

@tsteur
Copy link
Member Author

tsteur commented Mar 17, 2020

Tested it locally a few times with roll up reporting and worked for me.

@diosmosis
Copy link
Member

Code looks good, though I would have expected there to be some change to the expected test files? Maybe this logic isn't completely covered?

@tsteur
Copy link
Member Author

tsteur commented Mar 19, 2020

The logic is all in Roll Up Reporting and not in core. I added some tests there. This logic pretty much does not affect core itself.

@diosmosis
Copy link
Member

I see, I thought the tests were meant to cover this code since there was an event handler added for ArchiveProcessor.ComputeNbUniques.getIdSites to a test.

@tsteur
Copy link
Member Author

tsteur commented Mar 19, 2020

true there is some test I forgot about it. The test failed initially but after adding the same sites in that event it fixed the tests showing that it works. That's why there is no change in the expected because just like before it returns the same idSites. Before it got these sites from the ArchiveProcessor.Parameters.getIdSites event

image

@tsteur tsteur merged commit 49e8a1c into 3.x-dev Mar 24, 2020
@tsteur tsteur deleted the skipuniquevisitors branch March 24, 2020 21:23
jonasgrilleres pushed a commit to 1024pix/pix-analytics that referenced this pull request Sep 22, 2020
jbuget pushed a commit to 1024pix/pix-analytics that referenced this pull request Sep 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Review PRs that need a code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants