New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix behaviour for segments using "not equals" or "not contains" on an action dimension #16172
Conversation
39f3096
to
c1cc31f
Compare
plugins/Live/Model.php
Outdated
@@ -466,7 +466,7 @@ public function makeLogVisitsQueryString($idSite, $startDate, $endDate, $segment | |||
$filterSortOrder = 'DESC'; | |||
} | |||
|
|||
$segment = new Segment($segment, $idSite); | |||
$segment = new Segment($segment, $idSite, Period\Factory::build('day', implode(',', array_filter([$startDate,$endDate])))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: That actually does not seem to create the correct period, as the bound parameter for the query when choosing the date range 2020-06-15,2020-06-21
are this:
1 = "2020-06-14 22:00:00"
2 = "2020-06-20 21:59:59"
3 = "2020-06-13 22:00:00"
4 = "2020-06-20 21:59:59"
So the inner query actually has other date values. Need to check why...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sgiehl I reckon this goes in the right direction looking good so far 👍
e7b545f
to
b3ad5d9
Compare
b3ad5d9
to
f3e6906
Compare
@tsteur I've done some adjustments and removed the usage of LogAggregator in that case. Also added some tests. Maybe you could have another look and check if there might be some cases that might not work anymore... |
it generated for me a query like this one: /* idSegments = [61] */
SELECT
log_visit.*
FROM
matomo_log_visit AS log_visit
WHERE
( log_visit.idsite in (?)
AND log_visit.visit_last_action_time >= ? )
AND
( ( log_visit.idvisit NOT IN (/* idSegments = [56] */
SELECT
log_inner.idvisit
FROM
(
SELECT
log_visit.idvisit
FROM
matomo_log_visit AS log_visit LEFT JOIN matomo_log_link_visit_action AS log_link_visit_action ON log_link_visit_action.idvisit = log_visit.idvisit
WHERE
( log_visit.idsite IN (?) AND log_visit.visit_last_action_time >= ? )
AND
( ( log_link_visit_action.idaction_url IN (SELECT idaction FROM matomo_log_action WHERE ( name LIKE CONCAT('%', ?, '%') AND type = 1 )) ) )
GROUP BY
log_visit.idvisit
ORDER BY
NULL
) AS log_inner) ) )
GROUP BY
log_visit.idvisit
ORDER BY
log_visit.idsite DESC, log_visit.visit_last_action_time DESC LIMIT 0, 12 I reckon we could remove one of the inner subquery similar to how we do it in the $logQueryBuilder = StaticContainer::get('Piwik\DataAccess\LogQueryBuilder');
$forceGroupByBackup = $logQueryBuilder->getForcedInnerGroupBySubselect();
$logQueryBuilder->forceInnerGroupBySubselect(LogQueryBuilder::FORCE_INNER_GROUP_BY_NO_SUBSELECT);
$query = $segmentObj->getSelectQuery($select, $from, $where, $bind); // could also catch this
$logQueryBuilder->forceInnerGroupBySubselect($forceGroupByBackup); This should get rid of one subquery. Otherwise looks good to me so far @sgiehl |
@tsteur updated the code & tests. Should now be ready for a final review |
Looks generally good. Is the startDate/endDate maybe missing from Or can we maybe not return true for |
Actually when looking through the code for me it seems the segment query shouldn't be used when loading an Archive. But can set those parameters to be sure |
400981d
to
66f38f2
Compare
@sgiehl can we still add |
@tsteur I've adjusted the code, so it will no longer use a subquery if no start date is given. If that is the case and a segment query is built, that would require a subquery a warning will be logged. |
f7f764b
to
dceda15
Compare
|
||
if ($requiresSubQuery && empty($this->startDate)) { | ||
$e = new Exception(); | ||
Log::warning("Avoiding segment subquery due to missing start date. Please ensure a start date is set when initializing a segment if it's used to build a query. Stacktrace:\n" . $e->getTraceAsString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you mean when start and end date is empty?
created follow up issue so we don't forget to act on it #16200
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry just realise shouldn't have merged this yet as it needs to also check if endDate
is empty @sgiehl ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
created #16201
This PR aims to fix the behaviour for segments using "not equals" or "not contains" on an action dimension as described in #13785.
@tsteur I tried to implement it like you suggested in #13785 (comment), but that actually caused a lot problems with segments that contain a union of segments.
When a segment is initialized the union of segments would by default be splitted up into the union of segments. But those segment union needs to be in a single subquery as it would have an incorrect results otherwise. For now I've put the code to generate that subquery into the segment initialize, which requires the period to be available. Not an optimal solution as a segment should not require a period to be initialized, but guess we would need to refactor the whole usage of the Segment class otherwise, as we need to do that stuff while parsing, but the parsing is also needed for methods that are independent from the period...
At least it seems to work the way it's currently implemented. A segment like
actionUrl!@piwik;browserName!@Chrome
generates a query for the visitor log like: