Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance in deleteLogDataForDeletedSites when deleting data for more than one site #18618

Closed
Starker3 opened this issue Jan 14, 2022 · 1 comment · Fixed by #19014
Labels
c: Performance For when we could improve the performance / speed of Matomo.
Milestone

Comments

@Starker3
Copy link
Contributor

Refs DEV-2528

We have a user report concerning slow performance for the deleteLogDataForDeletedSites scheduled task when more than one measurable has been deleted.
One of the queries that took really long to execute was the following:

DELETE log_form_field FROM matomo_log_form_field log_form_field LEFT JOIN matomo_log_form log_form ON log_form_field.idlogform = log_form.idlogform LEFT JOIN matomo_log_visit log_visit ON log_form.idvisit = log_visit.idvisit WHERE log_visit.idsite in (15,11)

This ran for over an hour and a half before they decided to kill it and instead run the following query for each ID site individually:

MariaDB [matomo]> DELETE log_form_field FROM matomo_log_form_field log_form_field LEFT JOIN matomo_log_form log_form ON log_form_field.idlogform = log_form.idlogform LEFT JOIN matomo_log_visit log_visit ON log_form.idvisit = log_visit.idvisit WHERE log_visit.idsite = 15;
Query OK, 0 rows affected (4.264 sec)

MariaDB [matomo]> DELETE log_form_field FROM matomo_log_form_field log_form_field LEFT JOIN matomo_log_form log_form ON log_form_field.idlogform = log_form.idlogform LEFT JOIN matomo_log_visit log_visit ON log_form.idvisit = log_visit.idvisit WHERE log_visit.idsite = 11;
Query OK, 0 rows affected (0.010 sec)

As you can see the above queries for each individual site executed very quickly.

They tried again after this to manually execute the query that was previously running for both ID sites, after 10 minutes they killed that manual query again because it was clearly a lot slower than just doing it individually for each ID site.

@Starker3 Starker3 added the Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced. label Jan 14, 2022
@bx80
Copy link
Contributor

bx80 commented Jan 17, 2022

Looking at the query plan differences between the two statements, the IN (15,11) clause is causing the use of a join buffer. Since this is an infrequently executed task, a simple solution would be to adjust the deleteLogDataForDeletedSites task to execute individual delete statements for each site.

@bx80 bx80 added Bug For errors / faults / flaws / inconsistencies etc. and removed Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced. labels Jan 17, 2022
@tsteur tsteur added c: Performance For when we could improve the performance / speed of Matomo. and removed Bug For errors / faults / flaws / inconsistencies etc. labels Jan 17, 2022
@tsteur tsteur added this to the 4.8.0 milestone Jan 17, 2022
@tsteur tsteur modified the milestones: 4.8.0, 4.9.0 Feb 9, 2022
@justinvelluppillai justinvelluppillai changed the title deleteLogDataForDeletedSites slow when deleting data for more than one site Improve performance in deleteLogDataForDeletedSites when deleting data for more than one site Apr 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c: Performance For when we could improve the performance / speed of Matomo.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants