Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search Keywords with No Results report issue (probably wrong filtering with lots of data) #11280

Open
mkrakiewicz opened this issue Jan 26, 2017 · 1 comment
Labels
Bug For errors / faults / flaws / inconsistencies etc. Waiting for user feedback Indicates the Matomo team is waiting for feedback from the author or other users.

Comments

@mkrakiewicz
Copy link

mkrakiewicz commented Jan 26, 2017

Hi,
I have a problem with "Search Keywords with No Results" report.
Piwik version: 2.16.2.

Problem:

  1. Using a segment in this report results in more data than is available for "All visits" setting (makes no sense). In this case I'm using example segment "visitEcommerceStatus==abandonedCart" but I think it does not matter. This is only happening with lots of data (I have a case of 1 day of 250K actions).

Steps to reproduce:

  1. For some specific day (1 day is enough) have enough data so that this method's: https://github.com/piwik/piwik/blob/2.16.2/plugins/Actions/API.php#L280 resulting DataTable is full and is limited to 499 rows (according to the setting datatable_archiving_maximum_rows_actions = 500).

  2. Have records so that this condition below is fulfilled so that rows with `28`= 1 (`28` is 'INDEX_SITE_SEARCH_HAS_NO_RESULT') are spread evenly in the result set:
    CASE WHEN (MAX(log_link_visit_action.custom_var_v5) = 0 AND log_link_visit_action.custom_var_k5 = '_pk_scount') THEN 1 ELSE 0 END AS `28` . Here is the full archiving query I believe is used for aggregation: https://gist.github.com/mkrakiewicz/e3219368fa32a1a7491d49904a9418cf

  3. Then, resulting rows are filtered AFTER the result set is "clipped" to 500 records:
    https://github.com/piwik/piwik/blob/2.16.2/plugins/Actions/API.php#L286

  4. If I set datatable_archiving_maximum_rows_actions = -1, then the report is correct (after choosing a segment there are less keywords than in "All results" setting). Also several missing keywords appear. However then in above API method the DataTable row count is way more than default 500 (thousands in my case).

Conclusion:
Because the archiving query has several subqueries/groupings, there can be more records matching `28`= 1 than DataTable allows. Also, because each keyword is listed, this column cannot be summed in other way I believe. That means, even if in DataTable there are 500 rows, in the report there can be a few keywords or 0 records after filtering is done, while in reality there are several keywords which were cut off from the result.

Proposed fix:

  1. Remove the filtering in API method.

  2. Apply filtering on query level during archiving and have separate report for that. Currently Actions_sitesearch is used both by getSiteSearchNoResultKeywords and getSiteSearchKeywords.

@mattab
Copy link
Member

mattab commented Feb 21, 2017

Thanks for the report @mkrakiewicz

It would be super helpful if you could submit a Pull request with a new System test that shows the bug. This way we can see it reproduced and it's much easier for us to fix it afterwards. Let me know... thanks

@mattab mattab added this to the Backlog (Help wanted) milestone Feb 21, 2017
@mattab mattab added Bug For errors / faults / flaws / inconsistencies etc. Waiting for user feedback Indicates the Matomo team is waiting for feedback from the author or other users. labels Feb 21, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug For errors / faults / flaws / inconsistencies etc. Waiting for user feedback Indicates the Matomo team is waiting for feedback from the author or other users.
Projects
None yet
Development

No branches or pull requests

2 participants