Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds message about data retention on gdpr overview #12878

Merged
merged 7 commits into from May 21, 2018
Merged

Conversation

sgiehl
Copy link
Member

@sgiehl sgiehl commented May 9, 2018

fixes #12864

@sgiehl sgiehl added not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org. Needs Review PRs that need a code review labels May 9, 2018
@sgiehl sgiehl added this to the 3.5.1 milestone May 9, 2018
@tsteur
Copy link
Member

tsteur commented May 9, 2018

FYI: We still have to mention the data retention period they should enter in a privacy notice, consent notice, or privacy policy.

This would be usually whatever is higher - aggregated reports or raw data retention rate. If the delete reports delete all reports except the numeric reports, then we only show the raw/log data retention rate to be used for this. That's because aggregated reports still contain personal information and therefore need to be included in the data retention rate.

@mattab
Copy link
Member

mattab commented May 10, 2018

Maybe we should add a sentence below the bullet points "Data retention policy can be set by a Super User." with the link only there for Super Users.
(IIRC the page "Anonymise data" is not visible to admin users)

@tsteur I would only mention both and let users pick the one they need depending on their needs and requirements. we can't really go into more details for them, because they may not have PII (user id, custom dim) in their aggregate reports and then would only need to mention the raw data retention.

@tsteur
Copy link
Member

tsteur commented May 10, 2018

@tsteur I would only mention both and let users pick the one they need depending on their needs and requirements. we can't really go into more details for them, because they may not have PII (user id, custom dim) in their aggregate reports and then would only need to mention the raw data retention.

the thing be to at least educate them on helping to make a decision. Nobody can realistically know which one to pick so we need to help our users.

@sgiehl
Copy link
Member Author

sgiehl commented May 10, 2018

I'm fine with changing it again, just let me know what exactly to change/add.

@tsteur
Copy link
Member

tsteur commented May 10, 2018

We would only need to add a message. Like "The overall data retention rate for your privacy policy is the raw data retention rate. Please note that aggregated reports may contain personal data as well. If you are using features like User ID, Custom Variables, Custom Dimension, or track personal data through other events such as events, page URLs or page titles, etc, then the overall data retention rate is the data retention rate that is higher. "

In reality, people sometimes track different websites and for some pages, they may have personal data in aggregated reports and for some not.

btw: I don't know if you use "Log data" somewhere, but lets maybe always use raw data instead as this is better understandable for people that don't know our tables etc are called "log" and also in general?

@sgiehl
Copy link
Member Author

sgiehl commented May 13, 2018

we use log data in some other translations in PrivacyManager. Should we replace it with raw data at all occurrences?

@tsteur
Copy link
Member

tsteur commented May 13, 2018

Makes sense I would say @mattab ?

@mattab
Copy link
Member

mattab commented May 13, 2018

Sounds good to replace wording by raw data 👍

</ul>
<p>
<br />
The overall data retention rate for your privacy policy is the raw data retention rate. Please note that aggregated reports may contain personal data as well. If you are using features like User ID, Custom Variables, Custom Dimension, or track personal data through other events such as events, page URLs or page titles, etc, then the overall data retention rate is the data retention rate that is higher.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think it would be clearer here to write "the higher of the two" instead of "the data retention rate that is higher".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another suggestion: "through other events" => "in other ways"

{% if deleteLogsEnable %}
<li>all aggregated reports are deleted after <strong>{{ reportRetention }}</strong>.</li>
{% else %}
<li>aggregated report are <strong>never</strong> deleted.</li>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo, should be aggregated reports

{% if deleteReportsEnable %}
<li>all visitors and actions raw data are deleted after <strong>{{ rawDataRetention }}</strong>.</li>
{% else %}
<li>visitors and actions raw data are <strong>never</strong> deleted.</li>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think it should be "visit and action raw data", since you can't technically delete the visitor themselves :P

}
if ($purgeDataSettings['delete_reports_older_than'] % 30 > 0) {
$rawDataRetention .= floor($purgeDataSettings['delete_reports_older_than']%30) . ' ' . Piwik::translate('Intl_PeriodDays');
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how beneficial it is, but it would look a bit better to use the singular translation if the year/day/month is 1. (ie, 1 year 1 month instead of 1 years 1 months)

@diosmosis
Copy link
Member

Not sure why but I'm getting weird results when I test this. If I set the "Delete old visitor logs" entry to 180 days & don't delete report data:

image

image

It says it will never delete raw data & will delete report data if > 1 year 5 months:

image

I think for the aggregated data section at least, it's not checking if the report deletion feature is enabled?

@diosmosis
Copy link
Member

If both features are enabled:

image

image

then the gdpr overview says it will delete raw data, but it gets the number wrong (13 days instead of 182, I think it's using the wrong setting):

image

@sgiehl
Copy link
Member Author

sgiehl commented May 20, 2018

@diosmosis I've updated the PR with your suggestions and fixed some stuff.

Copy link
Member

@diosmosis diosmosis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to go 👍 . Not sure if @tsteur or @mattab has to do a final approval since it's GDPR related, so just approving.

@diosmosis
Copy link
Member

Oh forgot about the UI tests, they will need to be updated before merging.

@mattab
Copy link
Member

mattab commented May 21, 2018

Feedback

  • Could you please add a UI test which shows both bullet point values?

Maybe @tsteur you have some more feedback?

@tsteur
Copy link
Member

tsteur commented May 21, 2018

I would change "then the overall data retention rate" => " then the overall data retention rate for your privacy policy ".

For some reason I would have personally listed the raw data retention before the aggregated report but that is totally fine either way.

I don't know if we already have an FAQ, but if so it would be great to link raw data and aggregated data to an FAQ entry so it is clear what they mean. All good though if we don't have them.

@sgiehl
Copy link
Member Author

sgiehl commented May 21, 2018

changed the order and updated the text.
Regarding FAQ: I didn't find some matching entries, but maybe @mattab knows if there are any.

{% if deleteReportsEnable %}
<li>all visits and actions raw data are deleted after <strong>{{ rawDataRetention }}</strong>.</li>
{% else %}
<li>visits and actions raw data are <strong>never</strong> deleted.</li>
{% endif %}
{% if deleteLogsEnable %}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed, but are these variables named correctly? The deleteReportsEnable block talks about raw data & the deleteLogsEnable block talks about aggregated reports.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems I somehow screwed it up 🙈

@diosmosis diosmosis merged commit 05d5b30 into 3.x-dev May 21, 2018
@diosmosis diosmosis deleted the gdprretention branch May 21, 2018 23:13
InfinityVoid pushed a commit to InfinityVoid/matomo that referenced this pull request Oct 11, 2018
* Adds message about data retention on gdpr overview

* Adds additional information about gdpr relevant data retention

* replace log data with raw data

* review adjustments

* adds ui test

* review adjustments

* update UI files
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Needs Review PRs that need a code review not-in-changelog For issues or pull requests that should not be included in our release changelog on matomo.org.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

In the GDPR Overview display the log data retention policy as configured
4 participants