Wrong data in visitors overview over time #15170
Comments
@kiliandr check your webserver (and PHP) error logs for any errors. I also recommend you write the output of the cron into a log file to see whether any errors ever show up there. Otherwise everything seems fine from our side, and I'm not sure there's much we can do. It would be good to check for errors and double check if you have any custom configuration in |
Not sure if it is connected, but I also see some odd calculations after updating to 3.12.0. No errors in the logs, and cron runs smoothly. |
@tsteur thanks. It is a standard configuration on a standard webhoster package. |
Any chance you can send us access to your Matomo install, together with some instructions on where/how to reproduce this issue, to hello at matomo.org? |
When you say visits over time, do you sum e.g. the days manually? When you say visitors, do you mean unique visitors or visits? Do you sum days, weeks or months? |
Thanks for sending us the details, @kiliandr. I can reproduce the issue: the sum of the individual days is indeed not equal to the visits shown for the month. Since we have had a few reports around this recently, I'm thinking this may be the cause: https://github.com/matomo-org/matomo/blob/3.13.0/core/ArchiveProcessor/Rules.php#L298-L302 Since we removed temporary archives in 3.11 or 3.12, we need to always include invalidated archives, just like we did with temporary archives. Otherwise these archives won't be included in https://github.com/matomo-org/matomo/blob/3.13.0/core/DataAccess/ArchiveSelector.php#L160-L165 I suppose the logic will need to make sure we also fetch invalidated archives, but always use the most recent archiveId and prefer a done archive if there is one. This might also be related to #15226. Just a guess, but I reckon the issue could be around there. @diosmosis any thoughts maybe? I'm not so much into this right now. |
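To make the proposed selection rule concrete, here is a minimal sketch in Python (not Matomo's PHP, and not the actual `ArchiveSelector` code). The `ArchiveRow` type, the `DONE` / `INVALIDATED` labels and the `pick_archive` helper are all hypothetical names used for illustration only. It assumes a higher archive ID means a more recent archive.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical status labels; Matomo's real done-flag values are not reproduced here.
DONE = "done"
INVALIDATED = "invalidated"


@dataclass(frozen=True)
class ArchiveRow:
    idarchive: int  # assumed to grow over time, so a higher ID is more recent
    status: str     # DONE or INVALIDATED


def pick_archive(rows: Iterable[ArchiveRow]) -> Optional[ArchiveRow]:
    """Pick the archive to read for one site/period.

    Prefer a done archive if there is one; otherwise fall back to an
    invalidated archive. Within the chosen group, use the most recent ID.
    """
    rows = list(rows)
    if not rows:
        return None
    done_rows = [row for row in rows if row.status == DONE]
    candidates = done_rows if done_rows else rows
    return max(candidates, key=lambda row: row.idarchive)
```

With this rule a period that only has an invalidated archive still yields data, instead of being skipped the way it would be if invalidated archives were filtered out before selection.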
@tsteur I tried to reproduce this in a test:
but it's passing for me w/ the current code. Can you provide some steps for how you reproduced it? I wonder if this is somehow a race condition where archives are invalidated while archiving is going on... |
@diosmosis not sure how to reproduce, but it looks like there could be various edge cases where it doesn't work. For example, it is trying to archive a week and doesn't include today because today only has an invalid archive. Then, because of some race condition, it might set the week as done while it is processing the previous week, even though the last day within the week might only have an invalid archive so far for some reason. Maybe people select the week as a range and it gets archived as done, or something like that. In general, the invalid archives have basically become the temporary archives, and we used to always look at those, so it sounds to me like we should always also look at an invalid archive. But the code seems not to do this under some conditions. |
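The edge case described above can be illustrated with a toy aggregation in Python. This is only a sketch under assumptions: the `aggregate_week` helper, the status strings and the visit counts are made up for illustration and do not come from Matomo or from the affected install.

```python
from typing import Dict, Tuple

# day -> (status, visits); all values below are invented for illustration.
DayArchives = Dict[str, Tuple[str, int]]


def aggregate_week(day_archives: DayArchives, include_invalidated: bool) -> int:
    """Sum visits over the day archives of one week.

    When include_invalidated is False, days that only have an invalidated
    archive are skipped, which mimics the suspected behaviour.
    """
    total = 0
    for status, visits in day_archives.values():
        if status == "invalidated" and not include_invalidated:
            continue
        total += visits
    return total


week = {
    "2019-11-25": ("done", 100),
    "2019-11-26": ("done", 110),
    "2019-11-27": ("invalidated", 90),  # only an invalidated archive so far
}

week_without_invalidated = aggregate_week(week, include_invalidated=False)
week_with_invalidated = aggregate_week(week, include_invalidated=True)
```

If the week is then marked as done with the smaller total, the stored week value no longer matches the sum of the individual days once the invalidated day is re-archived, which is the kind of mismatch reported in this issue.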
In this case it should initiate archiving for the invalidated day archive (that is what my test ended up doing).
Strangely, when I added that as a possible value and ran the test, I ended up getting a completely incorrect value... I'll keep looking. |
Ok, I think I found the race condition:
I was going to attempt a fix by making sure invalidated archives are used in ArchiveProcessor. However, that's not the whole fix, since we need to be able to invalidate in-progress archives as well, so that the week/month/etc. in this case eventually gets reprocessed. Or perhaps we don't perform actual invalidation of archives if archiving is ongoing? |
It should invalidate that week period as well. I guess that's maybe why we had temporary archives in the first place. Do you have an idea how to work around it? |
Also, in general, between archive start and archive end there might be only an invalidated archive and no done archive? Like CronArchive starts, it invalidates an archive, then the tracker starts, and say 30 min later it writes the done archive (it may take a while for the archive to finish)? |
Temp archives were around well before invalidated archives were. I suspect we just never ran into the race condition when implementing invalidating archives since we never had to think about it... Don't know how to work around it just yet.
That's what I thought, if there's only one core:archive process then the invalidation should only occur once and not interfere with archiving. But looking at the archives on demo.matomo.org, there's a day archive that's invalidated w/ a timestamp similar to the week archive, and another day archive w/ DONE_OK 2 mins later that I think corresponds to the month archive. I'm not 100% sure how it happens, but I think something needs to be reworked in how invalidation is done. |
@johsin18 Does this happen with relatively recent data or just old data? |
Well, as you can see it was the week of Nov 25th, 2019. I noticed the problem already the Monday after. I had the impression that it was caused by the Matomo update to 3.13.0, which I installed at some point in time in that week (released on 27th). |
@johsin18 just to confirm, you can't see the issue with old data, correct? |
No, old data is correct. |
In Matomo 3.12.0 a bug appeared that I have described here:
https://forum.matomo.org/t/visitors-overview-not-updating/34804/7