@tsteur opened this Pull Request on May 6th 2021 Member

Description:

I noticed a problem in the log aggregator that affects especially non-core plugins that aren't tracking. For example A/B testing were where clauses ignored when a segment is used.

For example when calling

$select = 'log_abtesting.idvariation as label, count(distinct log_abtesting.idvisitor) as uniqueVisitors, log_abtesting.entered as entered';
$from = ['log_abtesting'];
$where = 'log_abtesting.server_time >= ?
                AND log_abtesting.server_time <= ?
                AND log_abtesting.idsite IN (?) AND log_abtesting.idexperiment = 1';
$groupBy = 'log_abtesting.idvariation, log_abtesting.entered';
$this->logAggregator->generateQuery($select, $from, $where, $groupBy, $orderBy='');

Then it would generate below query:

SELECT /* segmenthash cc1bf84e0798d633ba96c7e24983cb07 */ /* sites 1 */ /* 2020-08-27,2021-05-06 */
                log_abtesting.idvariation as label, count(distinct log_abtesting.idvisitor) as uniqueVisitors, log_abtesting.entered as entered
            FROM
                matomo_logtmpsegmentad7af0eba5c424ef32a77fe905923136 AS logtmpsegmentad7af0eba5c424ef32a77fe905923136 INNER JOIN matomo_log_abtesting AS log_abtesting ON log_abtesting.idvisit = logtmpsegmentad7af0eba5c424ef32a77fe905923136.idvisit
            WHERE
                log_abtesting.idexperiment = 1
            GROUP BY
                log_abtesting.idvariation, log_abtesting.entered

as you notice the where clause is completely gone! And it wasn't applied in the table matomo_logtmpsegmentad7af0eba5c424ef32a77fe905923136 because we create the tmp table only once during the entire archiving and there is only this where applied: https://github.com/matomo-org/matomo/blob/4.3.0-b3/core/DataAccess/LogAggregator.php#L360 which is

log_visit.visit_last_action_time>= ? AND log_visit.visit_last_action_time<= ? AND log_visit.idsite IN (...)"

The where part of the actual query gets lost in https://github.com/matomo-org/matomo/blob/4.3.0-b3/core/DataAccess/LogAggregator.php#L382-L391 because it removes the where part thinking it is already applied in the logtmpsegment table but this is not the case.

With this PR it actually results in the correct query:

SELECT /* segmenthash cc1bf84e0798d633ba96c7e24983cb07 */ /* sites 1 */ /* 2020-08-27,2021-05-06 */
                log_abtesting.idvariation as label, count(distinct log_abtesting.idvisitor) as uniqueVisitors, log_abtesting.entered as entered
            FROM
                matomo_logtmpsegmentad7af0eba5c424ef32a77fe905923136 AS logtmpsegmentad7af0eba5c424ef32a77fe905923136 INNER JOIN matomo_log_abtesting AS log_abtesting ON log_abtesting.idvisit = logtmpsegmentad7af0eba5c424ef32a77fe905923136.idvisit
            WHERE
                log_abtesting.server_time >= ?
                AND log_abtesting.server_time <= ?
                AND log_abtesting.idsite IN (?) AND log_abtesting.idexperiment = 1
            GROUP BY
                log_abtesting.idvariation, log_abtesting.entered

I'm not 100% sure why no tests are failing now in core itself. Either they always included log_visit table and where condition or the where condition started differently. It is definitely an issue for A/B testing and from what I can see also other premium features.

Review

  • [ ] Functional review done
  • [ ] Potential edge cases thought about (behavior of the code with strange input, with strange internal state or possible interactions with other Matomo subsystems)
  • [ ] Usability review done (is anything maybe unclear or think about anything that would cause people to reach out to support)
  • [ ] Security review done see checklist
  • [ ] Code review done
  • [ ] Tests were added if useful/possible
  • [ ] Reviewed for breaking changes
  • [ ] Developer changelog updated if needed
  • [ ] Documentation added if needed
  • [ ] Existing documentation updated if needed
This Pull Request was closed on May 6th 2021
Powered by GitHub Issue Mirror