Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent archiving of data for time periods that start in the future #18790

Merged
merged 4 commits into from Mar 8, 2022

Conversation

bx80
Copy link
Contributor

@bx80 bx80 commented Feb 14, 2022

Description:

Fixes #18773

When archiving a week, month or year some sub-periods may fall in the future (eg. it's Thursday so we want to archive this week, but Friday, Saturday and Sunday are in the future).

Writing blank archive data for the future causes problems (and is inefficient) so the code in Archive.php#L629-L643 skips any archives for periods with a start date in the future or before the website was created. However for some reason it only skips archives with start dates two days greater than the current date, this causes blank archives to be written for today+1 and today+2.

This is a simple fix to skip any periods which start on a date later than the current date.

Prior to this change a running ./console core:archive command would result in Archive::prepareArchive() being called for dates two days into the future, after the change the method is only called up to and include the current date.

Review

@bx80 bx80 added the not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org. label Feb 14, 2022
@bx80 bx80 added this to the 4.8.0 milestone Feb 14, 2022
@bx80 bx80 self-assigned this Feb 14, 2022
@bx80 bx80 added the Needs Review PRs that need a code review label Feb 14, 2022
@sgiehl
Copy link
Member

sgiehl commented Feb 14, 2022

@bx80 could you maybe add some tests for that? Could be a simple test that triggers the archiving and checks which records are created in the archive tables.
Note: To ensure the test assumes a certain date is in the future, you can manipulate the "current time" by setting Date::$now

@sgiehl
Copy link
Member

sgiehl commented Feb 21, 2022

The code looks fine. But what I'm not fully sure about is, if that might cause issues when using other timezones. But guess even if the website is in a timezone that is ahead UTC, the start date should still not be in the future. 🤔 🤯

@sgiehl
Copy link
Member

sgiehl commented Feb 23, 2022

@tsteur would you mind having a quick look at this one. I guess it should be fine. But I'm always unsure if those changes could result in any timezone related issues. Maybe we should check if we have enough tests that are using other timezones to ensure this won't cause any problems?

@sgiehl sgiehl removed the Needs Review PRs that need a code review label Feb 23, 2022
Copy link
Member

@tsteur tsteur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I was going to say it's no problem because it's all in UTC. However, I was then double checking and set timezone to Pacific/Auckland for a site. It's now Feb 24th 10AM in Auckland.

UTC today is Feb 23rd.

Matomo correctly wants to archive data for Feb 24th. It should already allow archiving Feb 24th here but it would not allow it. It goes into continue. That's a problem.

In case it's useful:

$today->getDatetime(); //2022-02-23 00:00:00
 \Piwik\Date::now()->getDatetime(); // 2022-02-23 21:02:02
 \Piwik\Date::factory('now', 'Pacific/Auckland')->getDatetime(); //2022-02-24 10:02:02

@sgiehl
Copy link
Member

sgiehl commented Feb 24, 2022

Ok. Guess we should then start adding some archiving tests for sites with timezones ahead of UTC, while faking a specific time to ensure the timezone is already a day ahead.

@bx80
Copy link
Contributor Author

bx80 commented Feb 28, 2022

I've added an additional test which will expect the current local date to be included for archiving even if it's ahead of UTC. The range in the future check is now based of the site local time date rather than the UTC date at midnight.

@bx80 bx80 added the Needs Review PRs that need a code review label Feb 28, 2022
@sgiehl
Copy link
Member

sgiehl commented Feb 28, 2022

@bx80 guess that logically makes sense. Would you mind also adding a test case where the site's timezone is behind UTC and UTC time is already a day ahead? Guess in that case it shouldn't create archives as well, right?

@justinvelluppillai justinvelluppillai modified the milestones: 4.8.0, 4.9.0 Mar 1, 2022
Copy link
Member

@sgiehl sgiehl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks fine now and the added tests should cover all cases I think. I did some local testing by creating some sites with timezones far ahead and behind UTC and checked that no archives in the future were created. That all seem to work as expected.

@sgiehl sgiehl removed the Needs Review PRs that need a code review label Mar 3, 2022
@bx80 bx80 merged commit 680db11 into 4.x-dev Mar 8, 2022
@bx80 bx80 deleted the m-18773-do-not-archive-the-future branch March 8, 2022 20:21
@justinvelluppillai justinvelluppillai removed the not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org. label Apr 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Dependent archives show only partial data or no report data
4 participants