Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delayed rearchiving #16702

Merged
merged 6 commits into from Nov 13, 2020
Merged

Delayed rearchiving #16702

merged 6 commits into from Nov 13, 2020

Conversation

diosmosis
Copy link
Member

Description:

Refs #16396

Schedule rearchiving calls in a list and apply later in core:archive. CC @tsteur

Review

  • Functional review done
  • Usability review done (is anything maybe unclear or think about anything that would cause people to reach out to support)
  • Security review done see checklist
  • Code review done
  • Tests were added if useful/possible
  • Reviewed for breaking changes
  • Developer changelog updated if needed
  • Documentation added if needed
  • Existing documentation updated if needed

@diosmosis diosmosis added not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org. Needs Review PRs that need a code review labels Nov 11, 2020
@diosmosis diosmosis added this to the 4.0.0-RC milestone Nov 11, 2020
Copy link
Member

@sgiehl sgiehl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess the advantage of this changes is that it won't take longer to perform any action that created dozens of invalidations before, as those records are now created while archiving, right?

*
* @param int|int[]|'all' $idSites
* @param string $pluginName
* @param string|null $report
* @param Date|null $startDate
*/
public function reArchiveReportSafely($idSites, string $pluginName, string $report = null, Date $startDate = null)
public function scheduleReArchiving($idSites, string $pluginName, string $report = null, Date $startDate = null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess we need to adjust that in the plugins as well once that is merged

'ex' => $ex,
]);
} finally {
$reArchiveList->remove([$entry]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to remove the record even if it failed to create the invalidation? Shouldn't we keep it and try to add it another time?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No preference from my side here as I reckon usually it shouldn't fail? if there may be some reasons why it might happen could also keep the entry and retry later

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should only fail if the input is broken for some reason, this is mostly to replicate the old behavior where if it failed for some reason, invalidations wouldn't be added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that the entries are json encoded, that needs to be done for the removal as well. Otherwise it won't be removed at all. Alternatively you could also use another variable for the decoded entry...

Suggested change
$reArchiveList->remove([$entry]);
$reArchiveList->remove([json_encode($entry)]);

try {
list($idSites, $pluginName, $report, $startDateTs) = $entry;

$this->reArchiveReport($idSites, $pluginName, $report, Date::factory($startDateTs));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should only create date instance if startDateTs is not empty cause otherwise it might result in an error or creates a 'now' date or something else unexpected?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

{
try {
$this->reArchiveReport($idSites, $pluginName, $report, $startDate);
$reArchiveList = new ReArchiveList($this->logger);
$reArchiveList->add([$idSites, $pluginName, $report, $startDate ? $startDate->getTimestamp() : null]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally would prefer if we used an array like ['idsites' => $idSites, ...] this would allow us to quite easily extend or change the structure in the future if/when needed. and makes it maybe also more clear what each array item represents? Not too important though. It be mostly an issue if we eg ever removed for example an argument and then would need to do things like [$idSites, $pluginName, null, null, $newArgument] or something

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

use Piwik\Concurrency\DistributedList;
use Psr\Log\LoggerInterface;

class ReArchiveList extends DistributedList
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's great this was so easy to do and that we have this distributedlist already!

core/CronArchive.php Outdated Show resolved Hide resolved
Copy link
Member

@tsteur tsteur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diosmosis ideally we also adjust removeInvalidationsSafely to additionally remove any entry from the list that matches the given pluginName (and maybe idsites)? just to prevent a race condition when quickly activating and deactivating would cause the invalidations to be added after the plugin was deactivated. although maybe these invalidations would be ignored anyway when the plugin is no longer active?

@diosmosis
Copy link
Member Author

ideally we also adjust removeInvalidationsSafely to additionally remove any entry from the list that matches the given pluginName (and maybe idsites)?

👍

@diosmosis
Copy link
Member Author

@tsteur Applied pr fixes.

@tsteur
Copy link
Member

tsteur commented Nov 12, 2020

I'll be mostly in meetings tmrw. @sgiehl could you test this tomorrow? and then we adjust the plugins that use this in activate/deactivate plugin

foreach ($entries as $index => $entry) {
$entry = @json_decode($entry, true);
if (empty($entry)) {
continue;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't it make sense to always remove entries that can't be decoded? Otherwise they might kept in the list forever

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

$entry['idSites'],
$entry['pluginName'],
$entry['report'],
!empty($entry['startDate']) ? Date::factory($entry['startDate']) : null
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we maybe cast $entry['startDate'] to integer before passing it to the factory. It should normally already be an integer, but if it would be a string (containing a number), the factory will throw an exception

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@sgiehl
Copy link
Member

sgiehl commented Nov 12, 2020

tested that locally with adjusted Cohorts PR. Besides the comments I've added and #16705 it seems to work as expected.

@tsteur
Copy link
Member

tsteur commented Nov 13, 2020

👍 @diosmosis so seems good to merge once tests pass?

@diosmosis diosmosis merged commit 9a9981d into 4.x-dev Nov 13, 2020
@diosmosis diosmosis deleted the delayed-rearchiving branch November 13, 2020 06:32
@sgiehl
Copy link
Member

sgiehl commented Nov 13, 2020

Guess you missed that comment: #16702 (comment)
Will create a PR to fix that

@sgiehl sgiehl mentioned this pull request Nov 13, 2020
9 tasks
@diosmosis
Copy link
Member Author

Guess you missed that comment: #16702 (comment)

yes, missed it, the tab that had this github page didn't update I guess

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Review PRs that need a code review not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants