The sum_daily_nb_uniq_visitors is incorrect for certain date ranges when calling API methods using period=range. I've discovered this issue within UserCountry, DeviceDetection, UserSettings, and Provider methods. I suspect it exists in more, but my tests have only included those so far.
To reproduce from demo.piwik.org:
Referencing the results for Germany in the following UserCountry reports, starting with 2013-11-01 to 2013-11-30:
returns nb_visits = 5380, sum_daily_nb_uniq_visitors = 4759
Add one day -> 2013-11-01 to 2013-12-01:
returns an empty result set!
Add another day -> 2013-11-01 to 2013-12-02:
nb_visits = 5696, sum_daily_nb_uniq_visitors = 289 (!?)
Clearly the 2nd API call returning nothing is a problem. With the 3rd, you can see that the increase of 2 days from the 1st call increased the visit count by a believable number, but the unique visitors total drops dramatically from 4759 to only 289. That is impossible, because a sum of daily unique visitors can only grow as days are added to the range.
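The reproduction can be scripted. The sketch below builds a `period=range` request URL for a UserCountry report and checks the numbers quoted above against the invariant that a sum of daily values cannot shrink when days are appended. The `idSite` value, the `token_auth=anonymous` parameter, and the helper names are assumptions for illustration, and the end date 2013-11-30 for the first call is inferred from the "add one day" step; no request is actually sent.

```python
from urllib.parse import urlencode

BASE_URL = "http://demo.piwik.org/index.php"


def range_report_url(start, end, id_site=7, method="UserCountry.getCountry"):
    """Build a Reporting API URL for a custom date range (nothing is fetched)."""
    params = {
        "module": "API",
        "method": method,
        "idSite": id_site,          # assumption: the site id is not given in the report
        "period": "range",
        "date": f"{start},{end}",
        "format": "JSON",
        "token_auth": "anonymous",  # assumption: anonymous access on the demo
    }
    # Keep the comma in the date range readable instead of percent-encoding it.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


def non_decreasing(values):
    """True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


# (start, end, nb_visits, sum_daily_nb_uniq_visitors) as quoted in this ticket
reported = [
    ("2013-11-01", "2013-11-30", 5380, 4759),
    ("2013-11-01", "2013-12-02", 5696, 289),
]

visits_ok = non_decreasing([row[2] for row in reported])
uniques_ok = non_decreasing([row[3] for row in reported])

for start, end, _, _ in reported:
    print(range_report_url(start, end))
print("nb_visits consistent:", visits_ok)
print("sum_daily_nb_uniq_visitors consistent:", uniques_ok)
```

With the quoted figures, the visit counts pass the check and the unique visitor sums fail it, which matches the symptom described.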
Keywords: sum_daily_nb_uniq_visitors API range
Thanks for the report!
Do you think this is a new bug (regression) in 2.0-beta, or was the bug already there in 1.12?
I was experiencing similar results with 1.12 as well, however I had never before received an empty result set as in test #2 until recently with 2.0b-11.
I've also found another issue that may be related which is detailed in the following post.
They don't seem to follow the same pattern to generate the bad results. Still, they both involve irregular visit report numbers on API calls using period=range on some date ranges, so there may be a connection.
To reproduce the nb_visits error, look at the nb_visits value for Germany in the following links...
From Nov 1 to Nov 8, it reports 1490 visits.
From Nov 1 to Nov 15, it reports 1470 visits.
From Nov 1 to Nov 25, it reports 543 visits.
From Nov 1 to Nov 30, it reports 5380 visits.
As of 2.0.2, the second example of the sum_daily_nb_uniq_visitors tests no longer returns an empty result set. It instead returns a value which is certainly incorrect. In fact, when testing ranges like 2013-11-01 to 2013-12-01 and 2013-11-01 to 2013-12-02, the value returned looks like the sum for the dates in December only. It's as if it is ignoring the values from November entirely.
(re-writing here my post on http://forum.piwik.org/read.php?2,110025,110045 with more details )
I sometimes experience the very same problem on several of my websites tracked using Piwik 1.12, but it happened again right now when testing 2.1-rc3.
The workaround I use to fix the problem when I see it on a specific period is to run an "invalidateArchivedReports" operation:
and then re-launch archive.php:
sudo -u apache php /.../misc/cron/archive.php --url=http://... --force-idsites=61 --force-all-periods
Note: the erratic metrics I had right now were not for sum_daily_nb_uniq_visitors but for visits and actions. The unique visitors metric was unchanged by the workaround. So here the number of visits was lower than the number of unique visitors because the number of visits was wrong.
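For anyone scripting the first step of this workaround, here is a minimal sketch that builds the invalidation request URL. It assumes the `CoreAdminHome.invalidateArchivedReports` API method with its `idSites` and `dates` parameters; the host name, the token, and the dates are placeholders, and the helper name is hypothetical. It only builds the URL and does not send it; archive.php still has to be re-run afterwards as shown above.

```python
from urllib.parse import urlencode


def invalidate_url(piwik_url, id_sites, dates, token_auth):
    """Build the URL that asks Piwik to invalidate archives for given sites and dates."""
    params = {
        "module": "API",
        "method": "CoreAdminHome.invalidateArchivedReports",
        "idSites": ",".join(str(site) for site in id_sites),
        "dates": ",".join(dates),
        "token_auth": token_auth,  # must belong to a user allowed to invalidate
    }
    # Keep the commas in the id and date lists readable.
    return f"{piwik_url.rstrip('/')}/index.php?{urlencode(params, safe=',')}"


# Placeholder host and token; site 61 is the one used in the archive.php call above.
print(invalidate_url(
    "http://piwik.example.org",
    id_sites=[61],
    dates=["2013-11-01", "2013-12-02"],
    token_auth="YOUR_TOKEN",
))
```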
This bug may be a consequence of #4532 so maybe it is fixed since the 2.1 release.
It appears that the secondary issue I reported in comment 3 has been resolved as of 2.1. The original issue, however, is still open. The results are a bit different from what I described in the post, but the sum_daily_nb_uniq_visitors values are still wrong nonetheless.
I was able to reproduce this and found the actual issue. Will provide a fix but not sure whether it will be the best solution. There are many different ways to fix this issue...
In 9e86c7960608c42aebe1966bb368bb07cc4a34bc: refs #4377 make sure metrics like sum_daily_nb_uniq_visitors (which are renamed after aggregation) are summed correctly. If the period is for instance 2014-04-01,2014-05-01, we will sum two periods: the month of April 2014 and May 1st. The dataTable of the month will already contain the renamed column (as it was aggregated before), whereas the May 1st dataTable will not contain the renamed column but the original. The two columns therefore cannot be summed, and the original column will overwrite the value of the renamed column, meaning sum_daily_nb_uniq_visitors is in this case always the value of May 1st.
Note: To test this you have to invalidate all existing range archives (period=5)
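The failure described in the commit message can be modelled in a few lines. This is a simplified sketch, not Piwik's actual aggregation code or the actual fix: rows are plain dictionaries, the function names are hypothetical, and the numbers are made up for illustration. It shows how renaming after summing lets the day's original column overwrite the month's already-renamed column, and how mapping column names before summing avoids that.

```python
RENAMED = {"nb_uniq_visitors": "sum_daily_nb_uniq_visitors"}


def aggregate_buggy(rows):
    """Sum columns as-is, then rename: the original column clobbers the renamed one."""
    total = {}
    for row in rows:
        for column, value in row.items():
            total[column] = total.get(column, 0) + value
    for old, new in RENAMED.items():
        if old in total:
            # Overwrites the sum already stored under the renamed column.
            total[new] = total.pop(old)
    return total


def aggregate_fixed(rows):
    """Map each column to its renamed form before summing, so both forms add up."""
    total = {}
    for row in rows:
        for column, value in row.items():
            column = RENAMED.get(column, column)
            total[column] = total.get(column, 0) + value
    return total


# Illustrative numbers only: a month archive (already renamed) plus one day.
april = {"nb_visits": 5000, "sum_daily_nb_uniq_visitors": 4500}
may_1 = {"nb_visits": 300, "nb_uniq_visitors": 280}

buggy = aggregate_buggy([april, may_1])
fixed = aggregate_fixed([april, may_1])
print(buggy["sum_daily_nb_uniq_visitors"])  # only the value of May 1st survives
print(fixed["sum_daily_nb_uniq_visitors"])  # April and May 1st are both counted
```

In the buggy variant nb_visits is still summed correctly, which fits the report: the visit count grew by a believable amount while the unique visitor sum collapsed to the last day's value.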
In 7ca0a8c32378be8c4a911b1d3ea1475a3d8c7230: refs #4377 added link to ticket
In 140562dfc280a38ca292073d3434f525b14511dd: refs #4377 added test for this use case
In 12a1c2e2606c8236f241078b76e0049c992ffba7: refs #4377 fix some tests
In b39aadee3df8c94a33f444685527abb2a0163693: refs #4377 some more test fixes