GDPR: When "Pseudonimise User ID" is activated, still allow users to export data subjects requests for a given User ID #12839

mattab · 2018-05-07T03:44:06Z

Our User ID pseudonymisation computes a SHA-1 hash of the User ID, using the Matomo salt during hashing. (refs #12836 #12641 #12600)

This is not full anonymisation: given the User ID and the Matomo salt, it is possible to find all the visits for that particular user. In the backend, since we know the salt, we can in theory recompute the User ID hash (the pseudonym) and return all visit/action data for that User ID only. The goal of this issue is to implement this behavior for full transparency towards data subjects.
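To illustrate why this is pseudonymisation rather than anonymisation, here is a minimal Python sketch of a salted SHA-1 hash. The function name, the salt value, and the way the salt is combined with the User ID (simple concatenation) are assumptions for illustration only, not necessarily what Matomo does internally.

```python
import hashlib


def pseudonymise_user_id(user_id: str, salt: str) -> str:
    """Return a salted SHA-1 hex digest of the User ID.

    Hypothetical scheme: the salt is simply appended to the User ID.
    """
    return hashlib.sha1((user_id + salt).encode("utf-8")).hexdigest()


SALT = "example-instance-salt"  # hypothetical value, not a real Matomo salt

# The value stored at tracking time ...
stored_pseudonym = pseudonymise_user_id("alice@example.com", SALT)
# ... can be recomputed later by anyone who knows the User ID and the salt.
recomputed = pseudonymise_user_id("alice@example.com", SALT)

print(stored_pseudonym == recomputed)
```

Because the hash is deterministic, the same User ID and salt always map to the same pseudonym, which is what makes re-identification possible.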

Exporting data subjects data based on User ID

Current behavior

Currently, when User ID pseudonymisation is activated on the instance, all User IDs are replaced by their hashed values. Exporting a data subject's data by User ID then does not work, because the stored User ID is hashed/pseudonymised.

Expected behavior

When pseudonymisation of User IDs is enabled and the Matomo Super User exports a data subject's data for a given User ID, the data export should still work.
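The expected behavior could be sketched roughly as follows: hash the User ID entered by the Super User with the instance salt, then match against the stored pseudonyms. Everything here (function names, the visit record layout, the salt value, the concatenation order) is hypothetical and only meant to show the idea.

```python
import hashlib
from typing import Dict, List


def pseudonymise_user_id(user_id: str, salt: str) -> str:
    # Hypothetical scheme: SHA-1 over the User ID followed by the salt.
    return hashlib.sha1((user_id + salt).encode("utf-8")).hexdigest()


def export_visits_for_user(
    original_user_id: str, salt: str, visits: List[Dict[str, str]]
) -> List[Dict[str, str]]:
    """Return the visits whose stored (hashed) User ID matches the given one."""
    pseudonym = pseudonymise_user_id(original_user_id, salt)
    return [visit for visit in visits if visit["user_id"] == pseudonym]


SALT = "example-instance-salt"  # hypothetical value

# Stored visits only ever contain the pseudonym, never the original User ID.
visits = [
    {"user_id": pseudonymise_user_id("alice", SALT), "url": "/pricing"},
    {"user_id": pseudonymise_user_id("bob", SALT), "url": "/docs"},
    {"user_id": pseudonymise_user_id("alice", SALT), "url": "/checkout"},
]

exported = export_visits_for_user("alice", SALT, visits)
print([visit["url"] for visit in exported])
```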

tsteur · 2018-05-07T04:27:28Z

Do I understand correctly that you want the export to work when searching for the original userId? If so, I don't see why you would anonymize it in the first place. I don't think this should be possible.

mattab · 2018-05-07T05:03:00Z

We could delete the feature, but it's definitely still useful to have it, because if one steals the DB, or steals access to the UI, one can't find out the original User ID (which brings several layers of security). So the idea is to be 100% transparent about it. Since it's technically possible (and relatively easy) to re-identify a User ID that was pseudonymised, we should expose this through the UI by letting the data subject's data export still work.

tsteur · 2018-05-07T05:20:42Z

"one can't find out the original User ID (which brings several layers of security)"

That's not really true. You can still find it out with brute force etc., which doesn't take too long nowadays. Especially when that visitor has been on the site before logged in as a user.
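A rough sketch of the brute-force concern, assuming the attacker knows the salt and can enumerate plausible User IDs (for example sequential account numbers). The hashing scheme, the salt, and the ID format are all hypothetical.

```python
import hashlib
from typing import Iterable, Optional


def pseudonymise_user_id(user_id: str, salt: str) -> str:
    # Hypothetical scheme: SHA-1 over the User ID followed by the salt.
    return hashlib.sha1((user_id + salt).encode("utf-8")).hexdigest()


def reidentify(pseudonym: str, salt: str, candidates: Iterable[str]) -> Optional[str]:
    """Try each candidate User ID until one hashes to the given pseudonym."""
    for candidate in candidates:
        if pseudonymise_user_id(candidate, salt) == pseudonym:
            return candidate
    return None


SALT = "example-instance-salt"  # hypothetical value, assumed known to the attacker
leaked_pseudonym = pseudonymise_user_id("user-1042", SALT)

# Enumerating a small, predictable ID space is enough to recover the original.
candidate_ids = (f"user-{n}" for n in range(10_000))
print(reidentify(leaked_pseudonym, SALT, candidate_ids))
```

SHA-1 is fast to compute, so the cost of such a search is dominated by how guessable the User IDs are, not by the hash itself.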

tsteur · 2018-05-07T18:50:17Z

@mattab let's discuss this again later, as enabling this feature turns the "Pseudo Anonymization" into "Pseudo Secure".

It implies that something is anonymized, but the data is still accessible. Even an attacker who has the token_auth would still be able to get the data through the API etc. And it may be more likely that an attacker gets access to the token_auth than to the DB.

It is very important to clarify this feature before the 3.5.0 release, as you cannot change its meaning from "Pseudo Anonymization" to "Pseudo Secure" later: that is not what users who have this enabled would expect.

tsteur · 2018-05-07T18:59:57Z

Also, once an attacker has access to the DB, the attacker has access to the API too and can query it for specific users.

On top of that, having the userId salt in the DB does not help, as the result is neither secure nor anonymized.

mattab · 2018-05-07T22:15:34Z

"It implies that something is anonymized, but the data is still accessible."

FYI the word is "Pseudonym-ise", where a pseudonym (/ˈsjuːdənɪm/ SEW-də-nim) or alias (/ˈeɪliəs/) is a name that a person or group assumes for a particular purpose, which can differ from their first or true name. So there is no notion of anonymity or security.

mattab added the Enhancement and c: Privacy labels May 7, 2018

mattab mentioned this issue May 7, 2018

Do not use same userId hash on same site #12840

Closed

innocraft-automation added this to the For Prioritization milestone Jan 3, 2024
