sum_daily_nb_uniq_visitors calculations incorrect for some ranges in many API methods #4377

Closed
jasonbukowski opened this issue Dec 12, 2013 · 15 comments
Labels
Bug For errors / faults / flaws / inconsistencies etc.
Milestone

Comments

@jasonbukowski

The sum_daily_nb_uniq_visitors value is incorrect for certain date ranges when calling API methods using period=range. I've found this issue in the UserCountry, DeviceDetection, UserSettings, and Provider methods. I suspect it exists in more, but my tests have only covered those so far.

To reproduce from demo.piwik.org:

Referencing the results for Germany in the following UserCountry reports:

2013-11-01 to 2013-11-30:
http://demo.piwik.org/?module=API&method=UserCountry.getCountry&idSite=7&period=range&date=2013-11-01,2013-11-30&format=xml&token_auth=anonymous

returns nb_visits = 5380, sum_daily_nb_uniq_visitors = 4759

Add one day -> 2013-11-01 to 2013-12-01:
http://demo.piwik.org/?module=API&method=UserCountry.getCountry&idSite=7&period=range&date=2013-11-01,2013-12-01&format=xml&token_auth=anonymous

returns an empty result set!

Add another day -> 2013-11-01 to 2013-12-02:
http://demo.piwik.org/?module=API&method=UserCountry.getCountry&idSite=7&period=range&date=2013-11-01,2013-12-02&format=xml&token_auth=anonymous

nb_visits = 5696, sum_daily_nb_uniq_visitors = 289 (!?)

Clearly the 2nd API call returning nothing is a problem. With the 3rd, you can see that extending the range of the 1st call by 2 days increased the visit count by a believable number, but the unique visitors total dropped dramatically from 4759 to only 289. That is impossible, since sum_daily_nb_uniq_visitors is a sum of daily values and cannot shrink when days are added to the range.
Keywords: sum_daily_nb_uniq_visitors API range
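
For anyone who wants to script the comparison, here is a minimal sketch that rebuilds the request URLs used above. The helper name `range_report_url` is made up for illustration; the query parameters are the ones visible in the links in this report.

```python
from urllib.parse import urlencode


def range_report_url(base_url, method, id_site, start, end, token="anonymous"):
    """Build a period=range API URL like the ones in this report.

    Hypothetical helper: only the query parameters come from the report.
    """
    query = urlencode(
        {
            "module": "API",
            "method": method,
            "idSite": id_site,
            "period": "range",
            "date": f"{start},{end}",
            "format": "xml",
            "token_auth": token,
        },
        safe=",",  # keep the comma in the date range readable
    )
    return f"{base_url}/?{query}"


# The three ranges compared above, all starting on 2013-11-01.
for end in ("2013-11-30", "2013-12-01", "2013-12-02"):
    print(range_report_url("http://demo.piwik.org", "UserCountry.getCountry",
                           7, "2013-11-01", end))
```

Fetching each URL and reading the Germany row is left out here, since the response layout may differ between versions.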

@mattab
Member

mattab commented Dec 13, 2013

Thanks for the report!

Do you think this is a new bug (regression) in 2.0-beta, or was the bug already there in 1.12 ?

@jasonbukowski
Author

I was seeing similar results with 1.12 as well. However, I had never received an empty result set as in test #2 until recently, with 2.0b-11.

@jasonbukowski
Author

I've also found another issue that may be related which is detailed in the following post.

[http://forum.piwik.org/read.php?2,108244,108311#msg-108311]

They don't seem to follow the same pattern to generate the bad results. Still, they both involve irregular visit report numbers on API calls using period=range on some date ranges, so there may be a connection.

To reproduce the nb_visits error, look at the nb_visits value for Germany in the following links...

http://demo.piwik.org/?module=API&method=UserCountry.getCountry&idSite=7&period=range&date=2013-11-01,2013-11-08&format=xml&token_auth=anonymous

From Nov 1 to Nov 8, it reports 1490 visits.

http://demo.piwik.org/?module=API&method=UserCountry.getCountry&idSite=7&period=range&date=2013-11-01,2013-11-15&format=xml&token_auth=anonymous

From Nov 1 to Nov 15, it reports 1470 visits.

http://demo.piwik.org/?module=API&method=UserCountry.getCountry&idSite=7&period=range&date=2013-11-01,2013-11-25&format=xml&token_auth=anonymous

From Nov 1 to Nov 25, it reports 543 visits.

http://demo.piwik.org/?module=API&method=UserCountry.getCountry&idSite=7&period=range&date=2013-11-01,2013-11-30&format=xml&token_auth=anonymous

From Nov 1 to Nov 30, it reports 5380 visits.
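
A quick way to flag this kind of result: with a fixed start date, nb_visits can only stay the same or grow as the end date moves later. The sketch below checks the four values quoted above; the function name `find_decreases` is made up for illustration.

```python
def find_decreases(samples):
    """Return (prev_end, end, prev_value, value) for every drop.

    `samples` is a list of (range_end, value) pairs sharing one start date,
    ordered by range_end. Any drop means at least one value is wrong.
    """
    problems = []
    for (prev_end, prev_val), (end, val) in zip(samples, samples[1:]):
        if val < prev_val:
            problems.append((prev_end, end, prev_val, val))
    return problems


# nb_visits for Germany, ranges starting 2013-11-01 (values from this comment).
germany_nb_visits = [
    ("2013-11-08", 1490),
    ("2013-11-15", 1470),
    ("2013-11-25", 543),
    ("2013-11-30", 5380),
]

for prev_end, end, prev_val, val in find_decreases(germany_nb_visits):
    print(f"range ending {end} reports {val}, fewer than {prev_val} for range ending {prev_end}")
```

The same check applies to sum_daily_nb_uniq_visitors, since it is also a sum over days.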

@jasonbukowski
Author

As of 2.0.2, the second example of the sum_daily_nb_uniq_visitors tests no longer returns an empty result set. Instead it returns a value that is certainly incorrect. In fact, for ranges like 2013-11-01 to 2013-12-01 and 2013-11-01 to 2013-12-02, the value returned looks like the sum for the December dates only. It's as if the values from November are ignored entirely.

@anonymous-matomo-user

(re-writing here my post on http://forum.piwik.org/read.php?2,110025,110045 with more details )

I sometimes experience the very same problem on several of my websites tracked with Piwik 1.12, and it happened again just now when testing 2.1-rc3.

The workaround I use to fix the problem when I see it on a specific period, is to run an "invalidateArchivedReports" operation:

http://.../index.php?token_auth=...&module=API&method=CoreAdminHome.invalidateArchivedReports&idSites=61&dates=2014-02-03,2014-02-04

and then re-launch archive.php:

sudo -u apache php /.../misc/cron/archive.php --url=http://... --force-idsites=61 --force-all-periods
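
If it helps, the invalidation URL can be assembled with a small script. This is only a sketch: the function name, the example host, and the token are placeholders, and the query parameters are the ones shown in the URL above.

```python
from urllib.parse import urlencode


def invalidate_reports_url(index_url, token_auth, id_sites, dates):
    """Build a CoreAdminHome.invalidateArchivedReports request URL.

    Hypothetical helper; `index_url` and `token_auth` must be supplied
    by the caller, as they are elided in the example above.
    """
    query = urlencode(
        {
            "token_auth": token_auth,
            "module": "API",
            "method": "CoreAdminHome.invalidateArchivedReports",
            "idSites": ",".join(str(site) for site in id_sites),
            "dates": ",".join(dates),
        },
        safe=",",  # keep comma-separated lists readable
    )
    return f"{index_url}?{query}"


# Placeholder host and token, real site id and dates from the example above.
print(invalidate_reports_url("http://piwik.example/index.php", "TOKEN",
                             [61], ["2014-02-03", "2014-02-04"]))
```

The archive.php run still has to be started separately afterwards.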

Note: the erratic metrics I saw this time were not sum_daily_nb_uniq_visitors but visits and actions. The unique visitors metric was unchanged after the workaround. So in this case the number of visits was lower than the number of unique visitors because the number of visits was wrong.

@anonymous-matomo-user

This bug may be a consequence of #4532 so maybe it is fixed since the 2.1 release.

@jasonbukowski
Author

It appears that the secondary issue I reported on in comment 3 has been resolved as of 2.1. The original issue, however, is still open. The results are a bit different than I described in the post, but the sum_daily_nb_uniq_visitors are still wrong, nonetheless.

@tsteur
Member

tsteur commented May 7, 2014

I was able to reproduce this and found the actual issue. Will provide a fix but not sure whether it will be the best solution. There are many different ways to fix this issue...

@tsteur
Member

tsteur commented May 7, 2014

In 9e86c79: refs #4377 make sure metrics like sum_daily_nb_uniq_visitors (which are renamed after aggregation) are summed correctly. If the period is, for instance, 2014-04-01,2014-05-01, we sum two periods: the month of April 2014 and May 1st. The dataTable of the month already contains the renamed column (as it was aggregated before), whereas the May 1st dataTable contains not the renamed column but the original. The two columns therefore cannot be summed, and the original column overwrites the value of the renamed column, meaning sum_daily_nb_uniq_visitors is in this case always the value of May 1st.
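
The effect described in this commit message can be reproduced with a toy model. The sketch below is not Piwik's code: it assumes each table is a flat dict of column totals, the function names are invented, and the numbers are made up. It only illustrates why renaming after summing loses the month's value, and how mapping to the final column name before summing avoids that.

```python
# Columns that get a new name once they have been aggregated.
RENAMED_AFTER_AGGREGATION = {"nb_uniq_visitors": "sum_daily_nb_uniq_visitors"}


def aggregate_buggy(tables):
    """Sum columns by name, then rename: the rename overwrites earlier sums."""
    total = {}
    for table in tables:
        for column, value in table.items():
            total[column] = total.get(column, 0) + value
    for original, renamed in RENAMED_AFTER_AGGREGATION.items():
        if original in total:
            # Overwrites the value already summed under the renamed column.
            total[renamed] = total.pop(original)
    return total


def aggregate_fixed(tables):
    """Map each column to its final name first, then sum."""
    total = {}
    for table in tables:
        for column, value in table.items():
            target = RENAMED_AFTER_AGGREGATION.get(column, column)
            total[target] = total.get(target, 0) + value
    return total


# Made-up numbers: an already aggregated month plus one raw day.
april = {"nb_visits": 100, "sum_daily_nb_uniq_visitors": 80}
may_1st = {"nb_visits": 10, "nb_uniq_visitors": 7}

print(aggregate_buggy([april, may_1st]))  # the month's 80 is lost
print(aggregate_fixed([april, may_1st]))  # 80 + 7 are summed
```

In this model nb_visits is summed correctly either way, which matches the original report: the visit count grew by a believable amount while the unique visitors total collapsed to the value of the extra days.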

@tsteur
Member

tsteur commented May 7, 2014

Note: To test this you have to invalidate all existing range archives (period=5)

@tsteur
Member

tsteur commented May 7, 2014

In 7ca0a8c: refs #4377 added link to ticket

@tsteur
Member

tsteur commented May 7, 2014

In 140562d: refs #4377 added test for this use case

@tsteur
Member

tsteur commented May 7, 2014

In 12a1c2e: refs #4377 fix some tests

@tsteur
Member

tsteur commented May 7, 2014

In b39aade: refs #4377 some more test fixes

@mattab
Member

mattab commented May 7, 2014

In e163969: fix tests refs #4377

@jasonbukowski jasonbukowski added this to the 2.3.0 - Piwik 2.3.0 milestone Jul 8, 2014
sabl0r pushed a commit to sabl0r/piwik that referenced this issue Sep 23, 2014
…s (which are renamed after aggregation) are summed correctly. If period is for instance 2014-04-01,2014-05-01 we will sum two periods. The month of April 2014 and May 1st. The dataTable of the month will already contain the renamed column (as it was aggregated before) whereas May 1st datatable will not contain the renamend column but the original. Both columns cannot be summed therefore and the original column will overwrite the value of the renamed column. Meaning sum_daily_nb_uniq_visitors is in this case always the value of May 1st
sabl0r pushed a commit to sabl0r/piwik that referenced this issue Sep 23, 2014
sabl0r pushed a commit to sabl0r/piwik that referenced this issue Sep 23, 2014
sabl0r pushed a commit to sabl0r/piwik that referenced this issue Sep 23, 2014
sabl0r pushed a commit to sabl0r/piwik that referenced this issue Sep 23, 2014
sabl0r pushed a commit to sabl0r/piwik that referenced this issue Sep 23, 2014
This issue was closed.
4 participants