@KarthikRaja1388 opened this Issue on January 8th 2020

Issue: When viewing a report for a date range that spans two years (e.g. Dec 2019 - Jan 2020), the numbers do not add up. The evolution graph shows different numbers than the reports.

Steps to reproduce:

  1. Choose date range for example Dec 30 - Jan 5
  2. View any report; the number in the report does not match the sum of the numbers in the evolution graph

image

In the Overview report the visits should be well over 80,000 visits for the chosen period but it's only showing 24,792 visits

Note: Reports work fine if the date range falls within the same year

@tsteur commented on January 8th 2020 Member

@mattab moving this out of the backlog for now. It might already be fixed as part of other issues. And we also have #14123, so we could even close it as a duplicate.

@KarthikRaja1388 In case you haven't done so yet... Could you maybe invalidate the data for that range or week and archive it again? Or select a slightly different range, e.g. 2019-12-29 to 2020-01-04, to see if it happens there as well? I just tested it and could not reproduce it.
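
For anyone wanting to script the invalidation step, here is a minimal sketch that only builds the request URL for the `CoreAdminHome.invalidateArchivedReports` HTTP API method. The host name, token and site ID are placeholders, and the parameter names (`idSites`, `dates`, `period`) are as I understand them from the API reference, so please check them against your Matomo version before use.

```python
from urllib.parse import urlencode


def build_invalidation_url(base_url, token_auth, id_sites, dates, period="week"):
    """Build (but do not send) an invalidateArchivedReports request URL.

    base_url and token_auth are placeholders supplied by the caller.
    """
    params = {
        "module": "API",
        "method": "CoreAdminHome.invalidateArchivedReports",
        "idSites": ",".join(str(site) for site in id_sites),
        "dates": ",".join(dates),
        "period": period,
        "format": "json",
        "token_auth": token_auth,
    }
    return f"{base_url.rstrip('/')}/index.php?{urlencode(params)}"


# Hypothetical instance and token; the week in question starts on 2019-12-30.
url = build_invalidation_url(
    "https://matomo.example.org", "YOUR_TOKEN", [1], ["2019-12-30"]
)
print(url)
```

After the invalidation request succeeds, the data still has to be archived again (for example by the regular `core:archive` run) before the report changes.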

@lolobu commented on January 10th 2020

I have the very same problem with the same date range.
Basically, any report from any date until Jan 4th is correct. Now if I keep the same starting date for the range but add Jan 5th, the total number of visits decreases.
For instance:

  • Dec 30th - Jan 4th : 2315 visits in the visits summary widgets (which matches what the graph gives day per day)
  • Dec 30th - Jan 5th : 1543 visits (but it should be 2315 + 577 from Jan 5th, so 2892 visits)

I get the same discrepancy with any starting date from Dec 2nd until Dec 30th. Adding Jan 5th gives a lower total than ending on Jan 4th.
But if I use Dec 31st instead of Dec 30th as starting date, the total number of visits reported is correct with either Jan 4th or Jan 5th as ending date.
And if the starting date is Dec 1st (or any earlier date), the problem is also gone. The totals are correct again.

@tsteur commented on January 11th 2020 Member

I can actually reproduce it indeed. I was selecting up to the 4th when trying to reproduce it and it was working. But to the 5th it is indeed broken. Basically, there must be some issue with processing that week.

@tsteur commented on January 12th 2020 Member

FYI: For a test I just invalidated the data and then archived again and it was showing the correct number afterwards. I also confirmed through debugging that it is using the right dates when archiving.

The number of visits shown for the first week of the new year was some random number. Like it wasn't the sum of 30th+31st or the sum of 1st-5th Jan.
If I had to guess, there was maybe some issue with report invalidation.

Looking at the archive dates it looks like this:

image

So it was not re-archived in the new year, which probably confirms the invalidation issue.
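
The check described above can be sketched roughly as follows: a week archive is suspect if it was last written on or before the final day of the week it covers. The function name and the sample timestamps are made up for illustration (the screenshot values are not reproduced here), and timezone handling is ignored.

```python
from datetime import date, datetime


def archived_before_period_end(period_end, ts_archived):
    """Return True if the archive was written before the period was complete.

    An archive written on the last day of the period is also treated as
    incomplete, because that day had not finished yet.
    """
    return ts_archived.date() <= period_end


# Hypothetical values: the week Dec 30 2019 - Jan 5 2020.
week_end = date(2020, 1, 5)
print(archived_before_period_end(week_end, datetime(2019, 12, 31, 23, 10)))  # True
print(archived_before_period_end(week_end, datetime(2020, 1, 6, 2, 0)))      # False
```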

@lolobu commented on January 13th 2020

Indeed, invalidating the data and archiving again fixed the mismatch. Thanks.

@diosmosis commented on July 6th 2020 Member

@tsteur / @KarthikRaja1388 can you still reproduce this?

Seems to be working now (on the demo, locally and on cloud).

@tsteur commented on July 7th 2020 Member

@diosmosis I tried to reproduce it by setting the date back to Jan 1st (and some other days) and then choosing Dec 30 2019 - Jan 5 2020, but couldn't reproduce it this way. I'm pretty sure the bug is still there, but I couldn't find anything while looking through the code. The problem could be anywhere (invalidating, archiving, ...). I would assume the issue is somewhere around the invalidation code, but maybe not.

I suppose to reproduce the issue we'd need to use Matomo 3.12.0 or so and try to reproduce it somehow like this:

  • Set date back to Dec 30 2019
  • Track something
  • Run core archive (not sure if browser archiving needs to be enabled or disabled and if it makes a difference)
  • Set date forward by one day and repeat above steps until July 6th

Then see what happens when selecting the week of Dec 30th to Jan 5th. Possibly it's more of an edge case that can be reproduced like this. On Cloud we invalidated all reports back then to repair them.
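
The day-by-day procedure above could be driven by a small loop like the sketch below. The `track` and `archive` callbacks are hypothetical hooks: in a real attempt they would set the server date, send a tracking request and run `core:archive`, none of which is done here.

```python
from datetime import date, timedelta


def replay_days(start, end, track, archive):
    """Call track(day) and then archive(day) for each day from start to end."""
    days = []
    day = start
    while day <= end:
        track(day)      # hypothetical: set the clock to `day` and track a visit
        archive(day)    # hypothetical: run the archiver for that day
        days.append(day)
        day += timedelta(days=1)
    return days


# Dry run that only records what would be done.
log = []
days = replay_days(
    date(2019, 12, 30),
    date(2020, 7, 6),
    track=lambda d: log.append(("track", d)),
    archive=lambda d: log.append(("archive", d)),
)
print(len(days), "days replayed")
```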

@KarthikRaja1388 do you remember if this was an issue on the cloud or self hosted? This could give us an indication whether we should try and reproduce it with browser archiving enabled or disabled (it is disabled on the cloud).

If we can't reproduce it, I wonder if the easiest way would be to invalidate the report for the first week once, at the beginning of the second week. Like a task that checks if it's the second week and if so, invalidates the previous week once (e.g. by setting a flag for that year in the option table). Just a thought...
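
A rough sketch of that idea, with a plain dict standing in for the option table. The function name and flag name are invented for illustration, and "second week" is assumed to mean ISO week 2 (weeks running Monday to Sunday).

```python
from datetime import date, timedelta


def first_week_to_invalidate(today, options):
    """Return (monday, sunday) of ISO week 1 once per year, else None.

    `options` is a dict used here in place of the option table.
    """
    iso_year, iso_week, _ = today.isocalendar()
    flag = f"first_week_invalidated_{iso_year}"  # hypothetical option name
    if iso_week != 2 or options.get(flag):
        return None
    options[flag] = True  # remember that this year has been handled
    monday = date.fromisocalendar(iso_year, 1, 1)  # requires Python 3.8+
    return monday, monday + timedelta(days=6)


options = {}
print(first_week_to_invalidate(date(2020, 1, 6), options))  # week to invalidate
print(first_week_to_invalidate(date(2020, 1, 7), options))  # None, already done
```

The caller would then invalidate the returned week so that the next archiving run rebuilds it.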

@ibril15 commented on August 18th 2020

Hi All,

I seem to be having the same issue. On-premise environment running 3.13.0. For me it seems the issue is when the week goes across months.

For Example:

Date Range: Week From 2020-07-27 To 2020-08-02

If I select Custom Range for 2020-07-27 To 2020-08-01 (one day previous), I get 1,661 visits. If I add 2020-08-02 to the range, or use the pre-built Week option for that week, I get 1,407 visits. On 2020-08-02 I have 314 visits.

If I include 2020-08-02 in a range like 2020-07-28 to 2020-08-03 (doesn't include a whole week), the sum of the individual days equals the number of visits and everything looks right. If I change the range to 2020-07-26 to 2020-08-03 (includes a whole week and some extra days), the numbers are wrong again.

So it appears that the issue may be when calculating a preset range (like "Week"), maybe when the week crosses months? When I select a custom date range that includes a pre-set range like week, I'm assuming the range detects that and uses the pre-archived week. And if that pre-archived week is wrong, all the subsequent calculations that depend on it will also be wrong.
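
To illustrate the hypothesis above, here is a small sketch that lists the complete Monday-to-Sunday weeks inside a custom range. It is only a way to check which of my ranges contain a whole week; it is not Matomo's actual range-splitting code, which I have not looked at.

```python
from datetime import date, timedelta


def full_weeks_in_range(start, end):
    """Return (monday, sunday) pairs for every complete week within the range."""
    # Move forward to the first Monday on or after `start` (Monday is weekday 0).
    monday = start + timedelta(days=(7 - start.weekday()) % 7)
    weeks = []
    while monday + timedelta(days=6) <= end:
        weeks.append((monday, monday + timedelta(days=6)))
        monday += timedelta(days=7)
    return weeks


ranges = [
    (date(2020, 7, 27), date(2020, 8, 1)),  # numbers look right
    (date(2020, 7, 27), date(2020, 8, 2)),  # numbers are wrong
    (date(2020, 7, 28), date(2020, 8, 3)),  # numbers look right
    (date(2020, 7, 26), date(2020, 8, 3)),  # numbers are wrong
]
for start, end in ranges:
    print(start, end, full_weeks_in_range(start, end))
```

In these four cases, the ranges with wrong totals are exactly the ones that contain the full week 2020-07-27 to 2020-08-02.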

I tried to invalidate the archive and re-archive the data but that didn't help. Please let me know if I can do any further troubleshooting.

Thanks a lot.

-Igor

@tsteur commented on August 18th 2020 Member

@ibril15 I wonder if this is maybe a different case. In the other cases invalidating the data and then rearchiving helped. I've tried to reproduce it for these ranges but it works nicely for me. I wonder if the range was maybe not correctly invalidated.

In the original issue the date range is actually a full week and it might be a different problem I think.

@ibril15 commented on August 18th 2020

Ok, I've split my comment off into #16320. Thanks.

@tsteur commented on August 31st 2020 Member

@diosmosis are you still working on this one?

@KarthikRaja1388 do you remember if this was an issue on the cloud or self hosted? This could give us an indication whether we should try and reproduce it with browser archiving enabled or disabled (it is disabled on the cloud).

I think so far we can't reproduce it and have no idea where something could possibly go wrong. This issue is incredibly difficult to replicate: we don't know if it is related to invalidating or archiving, we always have to set the server time back to around New Year, and we don't know if the problem happens before or after New Year.

I'm thinking we might otherwise wait until next year to see if the big archive refactoring maybe fixed it. My guess is it's related to invalidating, but it's hard to tell.

@diosmosis commented on August 31st 2020 Member

@tsteur no, I wasn't able to reproduce. I can try and reproduce by checking out 3.12, but given it's an edge case, unless we see it again I'm not sure if it's worth spending a lot of time on it?

@tsteur commented on August 31st 2020 Member

@diosmosis agreed. I'll move this to RC for now. Instead of finding the root cause, what we could maybe do is simply add an invalidation after the first week of the new year, so if it happens again it would just rearchive the data.

@tsteur commented on August 31st 2020 Member

If we don't want to apply such a workaround, we could also move it back to the backlog and see if it happens again around New Year.

@KarthikRaja1388 commented on September 3rd 2020

@tsteur I don't remember whether it is cloud or On-premise, but from the image attached I can say it is cloud.

@tsteur commented on October 6th 2020 Member

I'll be moving this one out of the Matomo 4 milestone for now as we can't really reproduce it and we'll wait to see if it happens this year again with Matomo 4.

@mattab commented on February 18th 2021 Member

Thanks for reporting this issue.
Unfortunately we're unable to reproduce the problem.
If anyone is reading this and has seen this behaviour using the latest version of Matomo, please leave a comment and a screenshot (and ideally how you managed to reproduce).
If we don't hear more about this issue, we will close it later. Thanks!

This Issue was closed on February 18th 2021