@mattab opened this Issue on March 13th 2019 Member

This is a frequent bug report. When visits are deleted via the GDPR Tools, the deleted visits still show up in the reports.

The goal of this issue is that whenever visits are deleted, we also call the Invalidate Reports code. As a result, the next time core:archive runs, all of the reports for the invalidated days (and the corresponding week, month, and year) will be reprocessed, and the deletion will be reflected in the reports (i.e. the deleted data will no longer be visible).
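To illustrate which periods are affected by deleting a single visit, here is a minimal Python sketch. It is not Matomo code: the function name `periods_to_invalidate` is hypothetical, and it assumes weeks run Monday to Sunday.

```python
from datetime import date, timedelta


def periods_to_invalidate(day: date) -> dict:
    """Return (start, end) for the day, week, month and year containing `day`.

    Hypothetical helper for illustration only; assumes Monday-based weeks.
    """
    # date.weekday() is 0 for Monday, so this steps back to the Monday.
    week_start = day - timedelta(days=day.weekday())
    week_end = week_start + timedelta(days=6)

    month_start = day.replace(day=1)
    # Day 28 exists in every month; adding 4 days always lands in the next month.
    next_month_start = (month_start.replace(day=28) + timedelta(days=4)).replace(day=1)
    month_end = next_month_start - timedelta(days=1)

    return {
        "day": (day, day),
        "week": (week_start, week_end),
        "month": (month_start, month_end),
        "year": (date(day.year, 1, 1), date(day.year, 12, 31)),
    }
```

Each of these four periods has an archive that includes the deleted visit, which is why invalidating only the day would not be enough.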

(This will also benefit Cloud customers: invalidating old reports is disabled on Cloud, yet sometimes we really need some data to be deleted.)

Also, in the UI we can remove the text "In this case you may want to consider to invalidate reports after deleting these visits." since this will now be done automatically.

@tsteur commented on April 22nd 2019 Member

@mattab do we need to take the log deletion setting into consideration? If we invalidate a report, will the invalidated report eventually disappear, or do we keep the report if there is no final report? Otherwise we risk that some reports cannot be regenerated.

@tsteur commented on April 28th 2019 Member

ping @mattab

@tsteur commented on April 28th 2019 Member

To invalidate, we want to use the method ArchiveInvalidator::rememberToInvalidateArchivedReportsLater($idSite, $date)

@tsteur commented on April 28th 2019 Member

We determine the dates to invalidate based on log_visit.visit_last_action_time of the deleted visit.

When we pass the $date to the method, we need to make sure to apply the site's timezone. Otherwise we might invalidate the wrong date.
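A minimal Python sketch of why the timezone matters. It is not Matomo code: `site_local_date` is a hypothetical name, and the sketch assumes the stored visit_last_action_time is a UTC timestamp in "YYYY-MM-DD HH:MM:SS" format.

```python
from datetime import date, datetime, timedelta, timezone, tzinfo


def site_local_date(visit_last_action_time_utc: str, site_tz: tzinfo) -> date:
    """Convert a stored UTC timestamp to the calendar date in the site's timezone.

    Hypothetical helper for illustration; assumes the input string is in UTC.
    """
    utc_dt = datetime.strptime(
        visit_last_action_time_utc, "%Y-%m-%d %H:%M:%S"
    ).replace(tzinfo=timezone.utc)
    # The report day is the date as seen in the site's timezone, not in UTC.
    return utc_dt.astimezone(site_tz).date()


# A site at UTC+12: a visit late on the 28th (UTC) belongs to the 29th locally.
utc_plus_12 = timezone(timedelta(hours=12))
local_day = site_local_date("2019-04-28 23:30:00", utc_plus_12)
```

For a real site one would pass a named zone such as `zoneinfo.ZoneInfo("Pacific/Auckland")` instead of a fixed offset, so that daylight saving time is handled. Using the UTC date directly would invalidate the 28th here and leave the 29th untouched.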

@mattab commented on April 29th 2019 Member

> do we need to take log deletion setting into consideration? If we invalidate a report, will the invalidated report eventually disappear or do we keep the report if there is no final report? Otherwise we risk some reports cannot be regenerated?

AFAIK the API which invalidates reports checks when the first logs are available and if the logs are deleted for this date, the reports won't be invalidated. So we shouldn't have to worry about it.

@tsteur commented on May 2nd 2019 Member

> AFAIK the API which invalidates reports checks when the first logs are available and if the logs are deleted for this date, the reports won't be invalidated. So we shouldn't have to worry about it.

So basically, if someone deletes a visit and expects this data to be deleted from reports, it won't actually be deleted, @mattab? I don't think that's what users expect. At the same time, deleting those reports is also not what people expect :)

@mattab commented on May 2nd 2019 Member

My comment was unclear... What I meant is: the API which invalidates reports checks the date-time of the oldest logs, and if the date to invalidate reports for is older than this date-time, then the reports are not invalidated (because we know there are no logs left to re-archive the reports from). Since deleting a visit means there are definitely still logs for that day, it should always result in an invalidation down the line.
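The check described here can be sketched roughly as follows in Python. This is an illustration of the rule as stated in this comment, not the actual Matomo implementation: `should_invalidate` is a hypothetical name, and comparing at day granularity is an assumption.

```python
from datetime import date, datetime


def should_invalidate(report_date: date, oldest_log_time: datetime) -> bool:
    """Return True if reports for `report_date` can be re-archived from logs.

    Hypothetical helper: reports older than the oldest remaining log are
    skipped, because no raw data is left to rebuild them from.
    """
    # Assumption: the comparison is made per day, not per second.
    return report_date >= oldest_log_time.date()


oldest_log = datetime(2018, 11, 2, 8, 0, 0)
recent = should_invalidate(date(2019, 5, 1), oldest_log)   # logs still exist
too_old = should_invalidate(date(2018, 5, 2), oldest_log)  # logs already purged
```

Under this rule, a date for which a visit can still be deleted is by definition not older than the oldest log, so the invalidation goes through.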

@tsteur commented on May 2nd 2019 Member

@mattab that's clear. The problem is when you delete a visit from 12 months ago and log deletion is set to 6 months... then the archive won't be invalidated/deleted, but a user would expect the data to be removed from that archive.

@tsteur commented on May 2nd 2019 Member

It's an edge case though, and we don't really need to think about it. It would only be an issue when deleting visits for which log purging hasn't happened yet.

I guess it's more about deleting older reports that contain personal data, but that's a different issue.

This Issue was closed on May 3rd 2019