Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add methods to remove invalidations. #16400

Merged
merged 6 commits into from Sep 11, 2020
Merged

Conversation

diosmosis
Copy link
Member

Will be used to, for example, remove pending invalidations when a custom report is deleted. This will prevent starting climulti:requests for these custom report IDs.

@diosmosis diosmosis added not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org. Needs Review PRs that need a code review labels Sep 7, 2020
@diosmosis diosmosis added this to the 4.0.0 milestone Sep 7, 2020
@tsteur
Copy link
Member

tsteur commented Sep 7, 2020

LGTM if tests pass @diosmosis

btw are we also deleting them when a site is deleted? Although then maybe they would be simply skipped anyway?

@diosmosis
Copy link
Member Author

Although then maybe they would be simply skipped anyway?

They should be skipped... I'll double check though.

@diosmosis
Copy link
Member Author

@tsteur added a check for whether the site exists to the start of QueueConsumer::getNextArchivesToProcess(). This means it will run before getting a group of archives to process concurrently, so it won't be run too often (just before a batch of climulti:requests starts). Otherwise we'd check for every invalidation we pull from the table. Didn't think it was worth it given how unlikely this is to happen.

@@ -122,6 +123,12 @@ public function __construct(LoggerInterface $logger, $websiteIdArchiveList, $cou

public function getNextArchivesToProcess()
{
// in case a site is deleted while archiving is running
if (!empty($this->idSite) && !$this->isSiteExists($this->idSite)) {
$this->logger->debug("Site ID = {$this->idSite} was deleted during archiving process, moving on.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diosmosis would the invalidation still be removed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tsteur No. there used to be a scheduled task that did that (removed invalidations that were over a week old), but we removed it for some reason. I can put it back or we can explicitly delete them when removing a site.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can put it back or we can explicitly delete them when removing a site.
sounds good.

Sometimes people also remove sites manually by deleting an entry in the site table but can deal with this later I suppose. Then we'd need to basically on demand remove it here.


private function isSiteExists($idSite)
{
$site = API::getInstance()->getSiteFromId($idSite);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diosmosis can we use the Site class here so it uses cached data? It be fine if only the next cronjob would notice the site was actually removed and that this one would basically "fail" when the site was deleted. Just to save some IO etc. The next cronjob could then make sure to remove the invalidations for that site as then the in-memory cache would be correct

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tsteur can't archiving runs take hours and hours? This would mean the in memory cache would be inaccurate for the entire run, correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can but it should be fine as usually also every hour a different job would start and mark them then as deleted basically

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tsteur if archive processes never process the same site, then will this still happen? I guess we could do it all at once at the start of a core:archive process, but then for many hours long processes a site could be deleted in the middle of one, and we'd continue archiving...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. Each site would be picked only once so a cache wouldn't be too important and in general the extra code for it might not be too important. I think at some point the archiving would simply fail for the site if it's removed in the middle of archiving. This should be fine as it would be a rare case and it would work the next time. I'm really just wanting to make sure we're cleaning up archive_invalidations table when a site is deleted. This could be done like you mentioned in the deleteSite event or so for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add it to a daily task, seems easier to handle all cases that way (ie, look for deleted sites and delete invalidations w/ those idsites).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@diosmosis generally might be could to also do it on deleteSite event if any possible using the same code. This way it be basically instantly applied and we might prevent getting some failed archive emails because of it.

@tsteur
Copy link
Member

tsteur commented Sep 10, 2020

@diosmosis there's a test failing in https://travis-ci.org/github/matomo-org/matomo/jobs/725881962#L738 maybe the delete doesn't work 100% correct

@diosmosis
Copy link
Member Author

failed copy paste

@diosmosis diosmosis merged commit 6b70e8c into 4.x-dev Sep 11, 2020
@diosmosis diosmosis deleted the archive-invalidations-remove branch September 11, 2020 01:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Review PRs that need a code review not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants