Getting wrong reports when choosing date range between years #15363

Closed
KarthikRaja1388 opened this issue Jan 8, 2020 · 17 comments
Labels
answered For when a question was asked and we referred to forum or answered it. Bug For errors / faults / flaws / inconsistencies etc.

Comments

@KarthikRaja1388

Issue: When viewing a report for a date range that spans two years (e.g. Dec 2019 - Jan 2020), the numbers do not add up. The evolution graph shows different numbers than the reports do.

Steps to reproduce:

  1. Choose a date range, for example Dec 30 - Jan 5
  2. View any report; the number in the report does not match the sum of the numbers in the evolution graph

image

In the Overview report there should be well over 80,000 visits for the chosen period, but it shows only 24,792 visits.

Note: Reports work fine if the date range falls within the same year.

@mattab mattab added this to the 4.0.0 milestone Jan 8, 2020
@mattab mattab added the Bug For errors / faults / flaws / inconsistencies etc. label Jan 8, 2020
@tsteur
Member

tsteur commented Jan 8, 2020

@mattab moving this out of the backlog for now. It might already be fixed as part of other issues. We also have #14123, so we could even close it as a duplicate.

@KarthikRaja1388 In case you haven't done so yet: could you invalidate the data for that range or week and archive it again? Or select a slightly different range, e.g. 2019-12-29 to 2020-01-04, to see if it happens there as well? I just tested it and could not reproduce it.

@lolobu

lolobu commented Jan 10, 2020

I have the very same problem with the same date range.
Basically, any report from any date until Jan 4th is correct. Now if I keep the same starting date for the range but add Jan 5th, the total number of visits decreases.
For instance:

  • Dec 30th - Jan 4th : 2315 visits in the visits summary widgets (which matches what the graph gives day per day)
  • Dec 30th - Jan 5th : 1543 visits (but it should be 2315 + 577 from Jan 5th, so 2832 visits)

I get the same discrepancy with any starting date from Dec 2nd until Dec 30th. Adding Jan 5th gives a lower total than ending on Jan 4th.
But if I use Dec 31st instead of Dec 30th as starting date, the total number of visits reported is correct with either Jan 4th or Jan 5th as ending date.
And if the starting date is Dec 1st (or any earlier), the problem is gone also. The totals are correct again.
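
One way to read this pattern: the totals only go wrong when the range would be built from the pre-archived week Dec 30 - Jan 5. The sketch below is a simplified model of such a split, not Matomo's actual code. It assumes Monday-to-Sunday weeks and assumes that whole calendar months are used first, then whole weeks in what is left, then single days. The function names are made up for this example.

```python
import calendar
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def _weeks_and_days(start, end):
    """Cover [start, end] with whole Monday-to-Sunday weeks where they fit, else single days."""
    parts = []
    current = start
    while current <= end:
        week_last = current + timedelta(days=6)
        if current.weekday() == 0 and week_last <= end:
            parts.append(("week", current, week_last))
            current = week_last + ONE_DAY
        else:
            parts.append(("day", current, current))
            current += ONE_DAY
    return parts


def decompose_range(start, end):
    """Assumed model: whole months first, then weeks and days in the gaps around them."""
    parts = []
    current = start
    gap_start = start
    while current <= end:
        last_day = calendar.monthrange(current.year, current.month)[1]
        month_last = current.replace(day=last_day)
        if current.day == 1 and month_last <= end:
            # Close the gap before this month, then take the month as one block.
            parts += _weeks_and_days(gap_start, current - ONE_DAY)
            parts.append(("month", current, month_last))
            current = month_last + ONE_DAY
            gap_start = current
        else:
            current += ONE_DAY
    parts += _weeks_and_days(gap_start, end)
    return parts


def uses_new_year_week(start, end):
    """True if the split contains the week Dec 30, 2019 to Jan 5, 2020 as a single block."""
    target = ("week", date(2019, 12, 30), date(2020, 1, 5))
    return target in decompose_range(start, end)
```

Under these assumptions, ranges starting Dec 2nd to Dec 30th and ending Jan 5th contain that week as a block, while ranges ending Jan 4th, starting Dec 31st, or starting on or before Dec 1st do not, which matches the cases listed above.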

@tsteur
Member

tsteur commented Jan 11, 2020

I can actually reproduce it indeed. I was selecting up to the 4th when trying to reproduce it and it was working. But to the 5th it is indeed broken. Basically, there must be some issue with processing that week.

@tsteur
Member

tsteur commented Jan 12, 2020

FYI: For a test I just invalidated the data and then archived again and it was showing the correct number afterwards. I also confirmed through debugging that it is using the right dates when archiving.

The number of visits shown for the first week of the new year looked like some random number: it was neither the sum of Dec 30th + 31st nor the sum of Jan 1st - 5th.
If I had to guess, there was maybe some issue with report invalidation.

Looking at the archive dates it looks like this:

image

So it was not re-archived in the new year, which probably confirms the invalidation issue.
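
For anyone who wants to script the invalidate-then-archive step, here is a rough sketch that only builds and prints the two console commands. The site ID `1`, the `./console` path and the helper name `repair_commands` are placeholders for this example, and the exact options depend on the Matomo version and install (some setups need extra options such as `--url` for `core:archive`), so check `./console help core:invalidate-report-data` before running anything.

```python
import shlex


def repair_commands(site_id, first_day, last_day, console="./console"):
    """Build the invalidate and archive commands for one date range (dates as YYYY-MM-DD)."""
    invalidate = [
        console,
        "core:invalidate-report-data",
        f"--dates={first_day},{last_day}",
        f"--sites={site_id}",
    ]
    # Re-run archiving afterwards so the invalidated reports get rebuilt.
    archive = [console, "core:archive"]
    return [invalidate, archive]


# Print the commands for the affected week instead of executing them.
for command in repair_commands(1, "2019-12-30", "2020-01-05"):
    print(shlex.join(command))
```

Printing first makes it easy to review the commands before running them by hand on the server.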

@lolobu

lolobu commented Jan 13, 2020

Indeed, invalidating the data and archiving again fixed the mismatch. Thanks.

@diosmosis
Member

diosmosis commented Jul 6, 2020

@tsteur / @KarthikRaja1388 can you still reproduce this?

Seems to be working now (on the demo, locally and on cloud).

@tsteur
Member

tsteur commented Jul 7, 2020

@diosmosis I tried to reproduce it by setting the date back to Jan 1st (and some other days) and then choosing Dec 30 2019 - Jan 5 2020, but couldn't reproduce it this way. I'm pretty sure the bug is still there but couldn't find anything while looking through the code. The problem could be anywhere (invalidating, archiving, ...). I would assume the issue is somewhere around the invalidation code, but maybe not.

I suppose to reproduce the issue we'd need to use Matomo 3.12.0 or so and try to reproduce it somehow like this:

  • Set date back to Dec 30 2019
  • Track something
  • Run core archive (not sure if browser archiving needs to be enabled or disabled and if it makes a difference)
  • Set date forward by one day and repeat above steps until July 6th

Then see what happens when selecting the week of Dec 30th to Jan 5th. It's possibly more of an edge case that can be reproduced this way. On Cloud we invalidated all reports back then to repair them.
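
As a rough illustration of how long such a run would be, the steps above can be laid out as a day-by-day plan. This sketch only generates the plan; it does not change the server date, track anything, or call Matomo. The step labels and function names are made up for this example.

```python
from datetime import date, timedelta


def reproduction_days(first=date(2019, 12, 30), last=date(2020, 7, 6)):
    """Yield every simulated server date from first to last, inclusive."""
    day = first
    while day <= last:
        yield day
        day += timedelta(days=1)


def reproduction_plan(days):
    """Yield (day, step) pairs in the order the steps would be carried out."""
    for day in days:
        yield day, "set server date"
        yield day, "track a test visit"
        yield day, "run core archive"


plan = list(reproduction_plan(reproduction_days()))
print(len(plan), "steps, starting with", plan[0])
```

With the default dates this covers 190 simulated days, so automating the loop is probably the only practical way to try it.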

@KarthikRaja1388 do you remember if this was an issue on the cloud or self hosted? This could give us an indication whether we should try and reproduce it with browser archiving enabled or disabled (it is disabled on the cloud).

If we can't reproduce it, I wonder if the easiest way would be to invalidate the report of the first week once, at the beginning of the second week. Like a task that checks if it's the second week and, if so, invalidates the previous week once (e.g. by setting a flag for that year in the option table). Just a thought...
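
A minimal sketch of that idea, under a few assumptions: weeks run Monday to Sunday, a plain dict stands in for the option table, and `invalidate` is a hypothetical callback rather than a Matomo function. The flag name is made up as well.

```python
from datetime import date, timedelta


def first_week_of_year(year):
    """Return (monday, sunday) of the Monday-to-Sunday week that contains Jan 1."""
    jan1 = date(year, 1, 1)
    monday = jan1 - timedelta(days=jan1.weekday())
    return monday, monday + timedelta(days=6)


def maybe_invalidate_first_week(today, options, invalidate):
    """Invalidate the first week once per year, and only during the second week.

    options    -- dict standing in for the option table (assumption)
    invalidate -- hypothetical callback taking (week_start, week_end)
    """
    flag = f"first_week_invalidated_{today.year}"
    week_start, week_end = first_week_of_year(today.year)
    in_second_week = week_end < today <= week_end + timedelta(days=7)
    if in_second_week and not options.get(flag):
        invalidate(week_start, week_end)
        options[flag] = True  # remember that this year is done
        return True
    return False
```

For 2020 the first week is Dec 30, 2019 to Jan 5, 2020, so a task calling this daily would trigger exactly once, on the first run between Jan 6 and Jan 12.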

@ibril15

ibril15 commented Aug 18, 2020

Hi All,

I seem to be having the same issue. On-premise environment running 3.13.0. For me the issue seems to occur when the week goes across months.

For Example:

Date Range: Week From 2020-07-27 To 2020-08-02

If I select Custom Range for 2020-07-27 To 2020-08-01 (one day previous), I get 1,661 visits. If I add 2020-08-02 to the range, or use the pre-built Week option for that week, I get 1,407 visits. On 2020-08-02 I have 314 visits.
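
Spelled out as a quick check, using only the three numbers quoted above (the function name is made up for this example):

```python
def range_shortfall(visits_without_last_day, last_day_visits, reported_total):
    """Return how many visits the reported total falls short of the sum of its parts."""
    expected_total = visits_without_last_day + last_day_visits
    return expected_total - reported_total


# 2020-07-27 to 2020-08-01 gives 1,661 visits, 2020-08-02 adds 314,
# but the full week is reported as 1,407.
print(range_shortfall(1661, 314, 1407))
```

So the expected total for the week is 1,975 visits, and the reported figure is 568 short of it.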

If I include 2020-08-02 in a range like 2020-07-28 to 2020-08-03 (doesn't include a whole week), the sum of the individual days equals the number of visits and everything looks right. If I change the range to 2020-07-26 to 2020-08-03 (includes a whole week and some extra days), the numbers are wrong again.

So it appears that the issue may be when calculating a preset range (like "Week"), maybe when the week crosses months? When I select a custom date range that includes a pre-set range like week, I'm assuming the range detects that and uses the pre-archived week. And if that pre-archived week is wrong, all the subsequent calculations that depend on it will also be wrong.

I tried to invalidate the archive and re-archive the data but that didn't help. Please let me know if I can do any further troubleshooting.

Thanks a lot.

-Igor

@tsteur
Member

tsteur commented Aug 18, 2020

@ibril15 I wonder if this is maybe a different case. In the other cases invalidating the data and then rearchiving helped. I've tried to reproduce it for these ranges but it works nicely for me. I wonder if the range was maybe not correctly invalidated.

In the original issue the date range is actually a full week and it might be a different problem I think.

@ibril15

ibril15 commented Aug 18, 2020

Ok, I've split my comment off into #16320. Thanks.

@tsteur
Member

tsteur commented Aug 31, 2020

@diosmosis are you still working on this one?

@KarthikRaja1388 do you remember if this was an issue on the cloud or self hosted? This could give us an indication whether we should try and reproduce it with browser archiving enabled or disabled (it is disabled on the cloud).

I think so far we can't reproduce it and have no idea where something could possibly go wrong. This issue is incredibly difficult to replicate: we don't know whether it is related to invalidating or archiving, we always have to set the server time back to around New Year, and we don't know whether the problem happens before or after New Year.

I'm thinking we might otherwise wait until next year to see if the big archive refactoring maybe fixed it. My guess is that it's related to invalidating, but it's hard to tell.

@diosmosis
Member

@tsteur no, I wasn't able to reproduce. I can try and reproduce by checking out 3.12, but given it's an edge case, unless we see it again I'm not sure if it's worth spending a lot of time on it?

@tsteur
Member

tsteur commented Aug 31, 2020

@diosmosis agreed. I'll move this to RC for now. Instead of finding the root cause, what we could maybe do is simply add an invalidation after the first week of the new year, so that if it happens again the data would just be rearchived.

@tsteur tsteur modified the milestones: 4.0.0, 4.0.0 RC Aug 31, 2020
@tsteur
Member

tsteur commented Aug 31, 2020

If we don't want to apply such a workaround, we could also move it back to the backlog and see if it happens again around New Year.

@KarthikRaja1388
Author

@tsteur I don't remember whether it was Cloud or on-premise, but judging from the attached image it was Cloud.

@tsteur
Member

tsteur commented Oct 6, 2020

I'll be moving this one out of the Matomo 4 milestone for now as we can't really reproduce it and we'll wait to see if it happens this year again with Matomo 4.

@tsteur tsteur removed this from the 4.0.0-RC milestone Oct 6, 2020
@tsteur tsteur added this to the Priority Backlog (Help wanted) milestone Oct 6, 2020
@mattab
Member

mattab commented Feb 18, 2021

Thanks for reporting this issue.
Unfortunately we're unable to reproduce the problem.
If anyone is reading this and has seen this behaviour using the latest version of Matomo, please leave a comment and a screenshot (and ideally how you managed to reproduce).
If we don't hear more about this issue, we will close it later. Thanks!

@mattab mattab closed this as completed Feb 18, 2021
@mattab mattab added the answered For when a question was asked and we referred to forum or answered it. label Feb 18, 2021
6 participants