Do we need to create a temporary segment table for unique week|month|year periods? #17750
Comments
I haven't tested this at all, and I'm not sure this is the right place for it or whether we'd even want it, but maybe we could do something like this:
diff --git a/core/ArchiveProcessor.php b/core/ArchiveProcessor.php
index b231d57eb4..11e00c17da 100644
--- a/core/ArchiveProcessor.php
+++ b/core/ArchiveProcessor.php
@@ -527,6 +527,7 @@ class ArchiveProcessor
protected function computeNbUniques($metrics, $sites)
{
$logAggregator = $this->getLogAggregator();
+ $previous = $logAggregator->disallowUsageSegmentCache();
$sitesBackup = $logAggregator->getSites();
$logAggregator->setSites($sites);
@@ -534,6 +535,9 @@ class ArchiveProcessor
$query = $logAggregator->queryVisitsByDimension(array(), false, array(), $metrics);
} finally {
$logAggregator->setSites($sitesBackup);
+ if ($previous) {
+ $logAggregator->allowUsageSegmentCache();
+ }
}
$data = $query->fetch();
return $data;
diff --git a/core/DataAccess/LogAggregator.php b/core/DataAccess/LogAggregator.php
index e283af2691..6e2d6b0b17 100644
--- a/core/DataAccess/LogAggregator.php
+++ b/core/DataAccess/LogAggregator.php
@@ -265,6 +265,14 @@ class LogAggregator
$this->allowUsageSegmentCache = true;
}
+
+ public function disallowUsageSegmentCache()
+ {
+ $previous = $this->allowUsageSegmentCache;
+ $this->allowUsageSegmentCache = false;
+ return $previous;
+ }
+
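To make the intent of the patch easier to follow, here is a minimal standalone sketch of the same save-and-restore idea. `SegmentCacheFlag`, `isSegmentCacheAllowed()` and `withoutSegmentCache()` are hypothetical names used only for illustration; only `allowUsageSegmentCache()` and `disallowUsageSegmentCache()` mirror the diff above, and none of this is the actual `LogAggregator` code.

```php
<?php
// Hypothetical stand-in for LogAggregator; only the flag handling is modelled.
final class SegmentCacheFlag
{
    private bool $allowUsageSegmentCache = true;

    public function allowUsageSegmentCache(): void
    {
        $this->allowUsageSegmentCache = true;
    }

    // Disables the cache and returns the previous state so the caller can restore it.
    public function disallowUsageSegmentCache(): bool
    {
        $previous = $this->allowUsageSegmentCache;
        $this->allowUsageSegmentCache = false;
        return $previous;
    }

    // Hypothetical getter, added only so the sketch can be observed.
    public function isSegmentCacheAllowed(): bool
    {
        return $this->allowUsageSegmentCache;
    }
}

// Runs $query with the segment cache disabled, then restores the earlier state,
// even if $query throws.
function withoutSegmentCache(SegmentCacheFlag $aggregator, callable $query)
{
    $previous = $aggregator->disallowUsageSegmentCache();
    try {
        return $query($aggregator);
    } finally {
        if ($previous) {
            $aggregator->allowUsageSegmentCache();
        }
    }
}
```

Returning the previous value means a caller that already had the cache disabled does not accidentally re-enable it afterwards.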
Might merit testing first. From testing hyperloglog, it seemed as if the DISTINCT handling was the cause of slow performance. (The queries there were slow and had no segment. I guess we could compare a bare DISTINCT query vs. a DISTINCT query with a segment.)
Just FYI @diosmosis: in the end it might run pretty much the same query again anyway. The temporary segment table would contain only the distinct idvisits, so it would still have to do another DISTINCT on idvisitor etc.
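A rough way to picture the point about the second DISTINCT, using plain PHP arrays instead of real tables. The rows, the `country` segment and both function names are made up for illustration; only the `idvisit` and `idvisitor` column names come from this thread.

```php
<?php
// Hypothetical in-memory stand-in for log_visit rows.
$logVisit = [
    ['idvisit' => 1, 'idvisitor' => 'a', 'country' => 'nz'],
    ['idvisit' => 2, 'idvisitor' => 'a', 'country' => 'nz'],
    ['idvisit' => 3, 'idvisitor' => 'b', 'country' => 'nz'],
    ['idvisit' => 4, 'idvisitor' => 'c', 'country' => 'de'],
];

// Hypothetical segment: visits from one country.
function matchesSegment(array $row): bool
{
    return $row['country'] === 'nz';
}

// Direct path: apply the segment and deduplicate idvisitor in a single pass.
function countUniquesDirect(array $rows): int
{
    $visitors = [];
    foreach ($rows as $row) {
        if (matchesSegment($row)) {
            $visitors[$row['idvisitor']] = true;
        }
    }
    return count($visitors);
}

// Temp-table path: first materialise the matching idvisits, then join back.
// The join result still has to be deduplicated on idvisitor.
function countUniquesViaTempTable(array $rows): int
{
    $tempSegmentTable = [];
    foreach ($rows as $row) {
        if (matchesSegment($row)) {
            $tempSegmentTable[$row['idvisit']] = true;
        }
    }

    $visitors = [];
    foreach ($rows as $row) {
        if (isset($tempSegmentTable[$row['idvisit']])) {
            $visitors[$row['idvisitor']] = true;
        }
    }
    return count($visitors);
}
```

Both functions return the same count; the temp-table variant just does the segment filtering up front, which only pays off if something else reuses that table.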
This particular query is actually coming from Cohorts. There is also the unique visitors and unique users query from core, which is a slightly different query.
Closing this issue as a wontfix. If the Cohorts plugin didn't exist, we would apply #17827 and make this a bit faster for On-Premise users. While debugging I found that when Cohorts is installed, the temporary segment table that gets created is reused, so creating it likely saves us time overall (it is the segment query that can be very slow, not the DISTINCT query). We can't have an if/else that depends on whether Cohorts is installed, because it would make test results quite unreliable: all tests around this would need to be executed both with and without that plugin. Also, keeping the temporary table will help if any other plugin ever uses unique data.
@tsteur if it helps, there could be a setting, set via DI, that disables the logic. Would that work?
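Roughly what I have in mind, as a sketch only. The setting name and both helper functions are hypothetical and the array is just a stand-in for DI definitions; this is not an existing Matomo DI key or API.

```php
<?php
// Hypothetical setting name; not an existing Matomo DI key.
const SETTING_UNIQUES_USE_SEGMENT_CACHE = 'archiving.uniques.useSegmentCache';

// Minimal stand-in for DI definitions: defaults that an environment can override.
function buildDefinitions(array $overrides = []): array
{
    $defaults = [SETTING_UNIQUES_USE_SEGMENT_CACHE => true];
    // For string keys, later arrays win in array_merge().
    return array_merge($defaults, $overrides);
}

// The uniques computation would consult this instead of checking for a plugin.
function shouldUseSegmentCacheForUniques(array $definitions): bool
{
    return (bool) $definitions[SETTING_UNIQUES_USE_SEGMENT_CACHE];
}
```

The default keeps today's behaviour, so tests would not depend on which plugins are installed; an install could opt out with `buildDefinitions([SETTING_UNIQUES_USE_SEGMENT_CACHE => false])`.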
Seeing queries like these:
As I think we mostly calculate unique visitors only for week|month|year|range, I wonder whether it's actually necessary to create a temporary table for these metrics. It might be a lot faster to just get the unique numbers directly.