@pestevao opened this Issue on September 14th 2017

Using Piwik 3.1.0, but this has been happening since we started using Piwik back in 2014.

We call the API method VisitTime.getVisitInformationPerServerTime (index.php?module=API&method=VisitTime.getVisitInformationPerServerTime&format=XML&idSite=3&period=day&date=today&expanded=1&token_auth=xxx) to export actions data to another internal system for integrated reports. We noticed that the current hour always has much higher values than all of the previous hours, which we did not expect.

On further investigation we found that some users have extremely long active sessions (in our case from a few hours to a few weeks), and the cumulative action count of each of these sessions appears in the current hour as actions for that hour. All of them! Once the hour has passed, the system apparently runs a different query and these sessions' actions are no longer counted for that hour, because the sessions are still active and are now reported under the new current hour, and so on until the session ends.

In the log_visit table, visit_last_action_time for these sessions always holds a recent date, so the session never expires. It is true that it is the same active session, but its actions were not all made in the current period, and that is the part that I assume is wrong. This is the reason for the extremely high value in the current hour.

When the current hour becomes the previous hour, Piwik shows the “correct-as-expected-evolution” value for its actions. The “wrong” value then shows up in the new current hour, on top of the “right” values for “normal” sessions.

In the table below, when 10h becomes the previous hour it will show a value similar to 40 000, and 11h will become the current hour with the high value for actions.

For example:

Visits by Server Time
0h 4138 3853 - 12654 3.06 5 min 5s 57.56%
1h 2095 1884 - 6336 3.02 4 min 39s 58.47%
2h 1244 1060 - 3735 3 4 min 33s 63.83%
3h 903 730 - 2253 2.5 3 min 3s 69.55%
4h 754 593 - 1826 2.42 2 min 46s 70.29%
5h 869 702 - 1804 2.08 2 min 13s 73.3%
6h 1704 1536 - 3558 2.09 2 min 10s 69.42%
7h 3943 3741 - 9349 2.37 2 min 40s 60.13%
8h 8447 8028 - 24183 2.86 4 min 36s 49.28%
9h 14359 13397 - 42453 2.96 5 min 45s 49.01%
10h 11016 10763 - 128422 11.66 10
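
As a quick sanity check on the figures above, the sketch below recomputes actions per visit for each hour and flags hours that stand out. It assumes the first numeric column is visits and the fourth is actions (the fifth column matches their ratio); the "more than twice the median" threshold is an arbitrary choice for illustration, not something Piwik uses.

```python
from statistics import median

# (hour, visits, actions) copied from the table above.
# Column meaning is an assumption: 1st number = visits, 4th number = actions.
rows = [
    (0, 4138, 12654),
    (1, 2095, 6336),
    (2, 1244, 3735),
    (3, 903, 2253),
    (4, 754, 1826),
    (5, 869, 1804),
    (6, 1704, 3558),
    (7, 3943, 9349),
    (8, 8447, 24183),
    (9, 14359, 42453),
    (10, 11016, 128422),
]

# Actions per visit for every hour.
ratios = {hour: actions / visits for hour, visits, actions in rows}

# Hypothetical rule of thumb: flag hours whose ratio is more than
# twice the median ratio of the day.
typical = median(ratios.values())
outliers = [hour for hour, ratio in ratios.items() if ratio > 2 * typical]

for hour, ratio in ratios.items():
    flag = "  <-- suspicious" if hour in outliers else ""
    print(f"{hour:>2}h  {ratio:5.2f}{flag}")
```

Only the current hour (10h) is flagged: its ratio is roughly four times that of any earlier hour, which is consistent with long-running sessions contributing their whole action total to it.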

A query similar to the one captured in MySQL that produces the table above, slightly tidied to be more readable:

-- Visits are selected by their last action time, and each one contributes its whole visit_total_actions.
SELECT '10' AS current_hour,
       COUNT(log_visit.idvisit),
       SUM(log_visit.visit_total_actions)
FROM log_visit AS log_visit
WHERE log_visit.visit_last_action_time >= '2014-11-10 10:00:00'
  AND log_visit.visit_last_action_time <= '2014-11-10 10:59:59'
  AND log_visit.idsite = '3'
  AND log_visit.visit_entry_idaction_url > 0;
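
To illustrate why this kind of query inflates the current hour, here is a minimal sketch using an in-memory SQLite database instead of MySQL. The two toy tables only mimic the columns needed here; the per-action table and its server_time column are assumptions modelled on Piwik's log_link_visit_action, and the sample visits are invented for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE log_visit (
    idvisit INTEGER PRIMARY KEY,
    idsite INTEGER,
    visit_first_action_time TEXT,
    visit_last_action_time TEXT,
    visit_total_actions INTEGER
);
-- Assumed per-action table: one row per action with its own timestamp.
CREATE TABLE log_link_visit_action (
    idvisit INTEGER,
    idsite INTEGER,
    server_time TEXT
);
""")


def add_visit(idvisit, action_times):
    """Insert one visit for site 3 plus one action row per timestamp."""
    conn.execute(
        "INSERT INTO log_visit VALUES (?, 3, ?, ?, ?)",
        (idvisit, min(action_times), max(action_times), len(action_times)),
    )
    conn.executemany(
        "INSERT INTO log_link_visit_action VALUES (?, 3, ?)",
        [(idvisit, t) for t in action_times],
    )


# Visit 1: a normal short visit, all 3 actions inside 10h.
add_visit(1, ["2014-11-10 10:05:00", "2014-11-10 10:06:00",
              "2014-11-10 10:07:00"])
# Visit 2: a long-running session started the day before;
# only its last action falls inside 10h.
add_visit(2, ["2014-11-09 08:00:00", "2014-11-09 12:00:00",
              "2014-11-09 20:00:00", "2014-11-10 02:00:00",
              "2014-11-10 07:00:00", "2014-11-10 10:30:00"])

window = ("2014-11-10 10:00:00", "2014-11-10 10:59:59")

# Same shape as the query above: bucket by the visit's last action time.
by_visit = conn.execute(
    "SELECT COUNT(idvisit), SUM(visit_total_actions) FROM log_visit "
    "WHERE visit_last_action_time >= ? AND visit_last_action_time <= ? "
    "AND idsite = 3",
    window,
).fetchone()

# Alternative: bucket every action by its own timestamp.
by_action = conn.execute(
    "SELECT COUNT(DISTINCT idvisit), COUNT(*) FROM log_link_visit_action "
    "WHERE server_time >= ? AND server_time <= ? AND idsite = 3",
    window,
).fetchone()

print("bucketed by visit_last_action_time:", by_visit)
print("bucketed by action server_time:   ", by_action)
```

In this toy data both approaches see two visits in the 10h window, but the first attributes all nine actions to it while the second counts only the four actions that actually happened in that hour. Whether a per-action query is practical on a real installation depends on table size and indexes, which this sketch does not address.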

@mattab commented on September 18th 2017 Member

Hi @pestevao
If I understand correctly, many of your users have visits that last for days. Usually a visit lasts a few minutes or a few hours. When the visit lasts for days, it won't appear in the reports until the visit ends, EXCEPT (as you note) for "Today's" data.

Could you let us know why your users' visits last for days? Is there some kind of auto-refresh in your code?

@pestevao commented on September 18th 2017

Hello @mattab,
We really don't know why this happens. Maybe some remote users are crawling our paid content; we validate subscriptions and login before a download or before accessing the content. We really can't find a clue about why this is happening.
The only way to "solve it" is by making a query directly on MySQL similar to the one above.
