New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
delete old log_* table data also without idvisit #17964
Conversation
You'll need to apply this patch, otherwise it won't get into the code. See https://github.com/matomo-org/test-examples/blob/trackingformlogs/logdeleter/general/index.php . diff --git a/plugins/PrivacyManager/PrivacyManager.php b/plugins/PrivacyManager/PrivacyManager.php
index f11ff26b43..98055c9150 100644
--- a/plugins/PrivacyManager/PrivacyManager.php
+++ b/plugins/PrivacyManager/PrivacyManager.php
@@ -672,6 +672,7 @@ class PrivacyManager extends Plugin
$deleteIntervalDays = $settings[$setting];
$deleteIntervalSeconds = $this->getDeleteIntervalInSeconds($deleteIntervalDays);
+ return true;
if ($lastDelete === false ||
$lastDelete === '' ||
((int)$lastDelete + $deleteIntervalSeconds) <= time()
|
c41ff89
to
ac94c39
Compare
@geekdenz Wouldn't it make sense to add/change some tests in LogDeleterTest to prove that the new logic works as expected. Ideally the new/changed tests would fail on 4.x-dev but work with the changes of this PR. |
@sgiehl it's not really needed since the logic we use itself is already tested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just tested it locally, seems to work as expected 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@geekdenz it seems like maybe some other methods can now be removed as well like Rawlogdao::deleteFromLogTable
?
The only other thought I have was that before we were doing idvisit IN(1,2,3,4,4)
and now we use idvisit=1 or idvisit = 2 or idvisit = 3
. I believe this is though the same for the query optimiser. Might be something to quickly double check / look into.
Also before we were doing idvisit in (1,2,3,4)
and now we use placeholders using ?
. This is a regression and we should change the below query in DataSubjects
to not use placeholders. When hundreds or thousands of placeholders are used then MySQL can get very very slow. We learned this over the years. We don't need to bind the parameters there if we cast the value to an integer.
will be more efficient, though just by a constant factor ( |
OK. Double checked manually and got expected results: mysql> SELECT distinct b.idvisit, c.visit_last_action_time, a.idlogform, a.idformview, a.field_name FROM matomo_log_form_field as a join matomo_log_form as b on a.idlogform=b.idlogform JOIN matomo_log_visit AS c ON c.idvisit=b.idvisit;
+---------+------------------------+-----------+------------+------------+
| idvisit | visit_last_action_time | idlogform | idformview | field_name |
+---------+------------------------+-----------+------------+------------+
| 10444 | 2020-08-01 00:00:00 | 24 | zaODNJ | input1 |
| 10444 | 2020-08-01 00:00:00 | 24 | zaODNJ | input2 |
| 10444 | 2020-08-01 00:00:00 | 24 | zaODNJ | input3 |
| 10449 | 2020-08-01 00:00:00 | 28 | 9GlRNX | input1 |
| 10449 | 2020-08-01 00:00:00 | 28 | 9GlRNX | input2 |
| 10449 | 2020-08-01 00:00:00 | 28 | 9GlRNX | input3 |
| 10443 | 2021-09-07 03:05:54 | 23 | ZDa9lV | input1 |
| 10443 | 2021-09-07 03:05:54 | 23 | ZDa9lV | input2 |
| 10443 | 2021-09-07 03:05:54 | 23 | ZDa9lV | input3 |
+---------+------------------------+-----------+------------+------------+
9 rows in set (0.01 sec)
mysql> SELECT distinct b.idvisit, c.visit_last_action_time, a.idlogform, a.idformview, a.field_name FROM matomo_log_form_field as a join matomo_log_form as b on a.idlogform=b.idlogform JOIN matomo_log_visit AS c ON c.idvisit=b.idvisit;
+---------+------------------------+-----------+------------+------------+
| idvisit | visit_last_action_time | idlogform | idformview | field_name |
+---------+------------------------+-----------+------------+------------+
| 10443 | 2021-09-07 03:05:54 | 23 | ZDa9lV | input1 |
| 10443 | 2021-09-07 03:05:54 | 23 | ZDa9lV | input2 |
| 10443 | 2021-09-07 03:05:54 | 23 | ZDa9lV | input3 |
+---------+------------------------+-----------+------------+------------+
3 rows in set (0.01 sec) |
Generally tested for quite a while and looks all good to me. Waiting for automated tests to finish before merging 🎉 |
Thanks @tsteur for the added test and info! |
fixes #16529
Description:
Reuse logic from DataSubject and refactor into core.
To test this manually, please see https://github.com/matomo-org/test-examples/blob/trackingformlogs/logdeleter/general/index.php . You can configure it with https://github.com/matomo-org/test-examples/blob/trackingformlogs/config/config.example.php .
Review