Add methods to remove invalidations. #16400
Conversation
LGTM if tests pass. @diosmosis btw, are we also deleting them when a site is deleted? Although then maybe they would simply be skipped anyway?
They should be skipped... I'll double check though.
@tsteur added a check for whether the site exists to the start of QueueConsumer::getNextArchivesToProcess(). This means it runs before getting a group of archives to process concurrently, so it won't run too often (just before a batch of climulti:requests starts). Otherwise we'd check for every invalidation we pull from the table. Didn't think that was worth it given how unlikely this is to happen.
@@ -122,6 +123,12 @@ public function __construct(LoggerInterface $logger, $websiteIdArchiveList, $cou

public function getNextArchivesToProcess()
{
    // in case a site is deleted while archiving is running
    if (!empty($this->idSite) && !$this->isSiteExists($this->idSite)) {
        $this->logger->debug("Site ID = {$this->idSite} was deleted during archiving process, moving on.");
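Fleshed out, the guard described in the diff would look roughly like this (a sketch only; the early `return` and the surrounding method body are assumptions, not the final implementation):

```php
public function getNextArchivesToProcess()
{
    // In case the site was deleted while archiving is running, stop
    // processing its queued invalidations instead of archiving stale data.
    if (!empty($this->idSite) && !$this->isSiteExists($this->idSite)) {
        $this->logger->debug("Site ID = {$this->idSite} was deleted during archiving process, moving on.");
        return null; // assumed: callers treat null as "nothing left for this site"
    }

    // ... otherwise pull the next batch of invalidations as before ...
}
```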
@diosmosis would the invalidation still be removed?
@tsteur No. There used to be a scheduled task that did that (removed invalidations that were over a week old), but we removed it for some reason. I can put it back, or we can explicitly delete them when removing a site.
"I can put it back or we can explicitly delete them when removing a site."

Sounds good. Sometimes people also remove sites manually by deleting an entry in the site table, but we can deal with that later I suppose. Then we'd basically need to remove it here on demand.
private function isSiteExists($idSite)
{
    $site = API::getInstance()->getSiteFromId($idSite);
@diosmosis can we use the Site class here so it uses cached data? It'd be fine if only the next cronjob noticed the site was actually removed, and this one basically "failed" once the site was deleted, just to save some IO etc. The next cronjob could then make sure to remove the invalidations for that site, as by then the in-memory cache would be correct.
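A cached variant along the lines of this suggestion might look like the following (a sketch, assuming Matomo's `Piwik\Site::getSite()` reads from the in-memory site cache and throws for unknown IDs; not necessarily the final code):

```php
private function isSiteExists($idSite)
{
    try {
        // Site::getSite() serves from the in-memory site cache where
        // possible, avoiding a DB/API round trip per invalidation batch.
        \Piwik\Site::getSite($idSite);
        return true;
    } catch (\Exception $ex) {
        // Assumed: an exception here means the site ID is unknown,
        // e.g. the site was deleted since the cache was populated.
        return false;
    }
}
```

The trade-off discussed below is that a long archiving run would keep seeing the stale cached entry until the next cronjob starts.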
@tsteur can't archiving runs take hours and hours? That would mean the in-memory cache would be inaccurate for the entire run, correct?
It can, but it should be fine: usually a different job starts every hour anyway and would then basically mark them as deleted.
@tsteur if archive processes never process the same site, will this still happen? I guess we could do it all at once at the start of a core:archive process, but then for processes that run many hours, a site could be deleted in the middle of one and we'd continue archiving...
True. Each site would be picked only once, so a cache wouldn't be too important, and in general the extra code for it might not be needed. I think at some point archiving would simply fail for a site if it's removed in the middle of a run. That should be fine, as it would be a rare case and it would work the next time. I really just want to make sure we clean up the archive_invalidations table when a site is deleted. For now this could be done in the deleteSite event, like you mentioned.
Will add it to a daily task, seems easier to handle all cases that way (i.e., look for deleted sites and delete invalidations w/ those idsites).
@diosmosis generally it might be good to also do it on the deleteSite event, if at all possible using the same code. This way it would be applied basically instantly, and we might avoid some failed-archive emails because of it.
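Both paths (the daily task and a `SitesManager.deleteSite` event handler) could share a helper along these lines (a sketch only; `deleteInvalidationsForSites()` is a hypothetical name, and the direct DELETE against `archive_invalidations` via Matomo's `Db`/`Common` helpers is an assumption, not the merged code):

```php
use Piwik\Common;
use Piwik\Db;

// Hypothetical helper: remove all queued invalidations for the given sites,
// so deleted sites no longer trigger climulti:request batches.
function deleteInvalidationsForSites(array $idSites)
{
    if (empty($idSites)) {
        return;
    }

    // Cast to int to keep the inlined IN (...) list safe.
    $idSites = array_map('intval', $idSites);

    $table = Common::prefixTable('archive_invalidations');
    $sql = sprintf(
        'DELETE FROM `%s` WHERE idsite IN (%s)',
        $table,
        implode(',', $idSites)
    );
    Db::query($sql);
}
```

The daily task would pass in every idsite found in `archive_invalidations` that no longer exists in the site table, while the event handler would pass just the one deleted site.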
@diosmosis there's a test failing in https://travis-ci.org/github/matomo-org/matomo/jobs/725881962#L738, maybe the delete doesn't work 100% correctly
failed copy paste
Will be used to, for example, remove pending invalidations when a custom report is deleted. This will prevent starting climulti:requests for these custom report IDs.