Surveillance of Piwik users via Fingerprint hash and Visitor ID and Visitor device #7667

Closed
mattab opened this issue Apr 12, 2015 · 1 comment
Labels
answered For when a question was asked and we referred to forum or answered it. c: Privacy For issues that impact or improve the privacy. Task Indicates an issue is neither a feature nor a bug and it's purely a "technical" change.

mattab commented Apr 12, 2015

The goal of this issue is to describe a privacy challenge in Piwik: the ability to spy, over time, on users tracked in Piwik.

What is the Visitor ID?

The unique visitor ID is a 16-character hexadecimal string. Every unique visitor is assigned a different ID, and this ID does not change once it is assigned.

  • It is stored in the first-party cookie. The ID is renewed 13 months after the user's first action.

The Visitor ID is stored in the Piwik database in the field idvisitor
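
To make the format concrete, here is a minimal sketch of producing a 16-character hexadecimal identifier. It is not Piwik's actual code: the function name `new_visitor_id` and the use of random bytes are assumptions made for illustration only.

```python
import secrets


def new_visitor_id() -> str:
    """Return a random 16-character hexadecimal string.

    Hypothetical helper: 8 random bytes encode to 16 hex characters.
    How Piwik really derives the ID is not shown here.
    """
    return secrets.token_hex(8)


visitor_id = new_visitor_id()
print(visitor_id, len(visitor_id))
```

An identifier like this is what would end up in the first-party cookie and in the idvisitor field.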

What is the fingerprinting hash?

When tracking a new user, Piwik computes a fingerprint hash for this user. The hash is built from a list of user attributes such as IP address, screen resolution, and browser plugins (this is done in the method getConfigHash). The Piwik Tracking API uses the fingerprint hash to try to record actions in the correct user visit. The fingerprint hash is used only when the Visitor ID (in the first-party cookie) was not found; otherwise the Visitor ID is used by default.
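
The fallback order described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `find_visit_key` is a hypothetical helper, not part of the Piwik Tracking API, and the returned field names simply reuse the database fields named in this issue.

```python
from typing import Optional, Tuple


def find_visit_key(cookie_visitor_id: Optional[str], fingerprint: str) -> Tuple[str, str]:
    """Choose which field/value pair is used to match an existing visit.

    Hypothetical helper: prefer the Visitor ID from the first-party
    cookie, and fall back to the fingerprint hash when no ID was found.
    """
    if cookie_visitor_id:
        return ("idvisitor", cookie_visitor_id)
    return ("config_id", fingerprint)


# Cookie present: the Visitor ID wins.
print(find_visit_key("0123456789abcdef", "some-fingerprint"))
# Cookie missing: the fingerprint hash is used instead.
print(find_visit_key(None, "some-fingerprint"))
```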

Notes about how the fingerprint hash is created:

  • The fingerprint hash is currently seeded with a salt that is different for each Piwik instance.
    • (This ensures that the same person tracked in multiple Piwik instances cannot be cross-matched across those instances.)
  • The fingerprint hash is also seeded with the Website ID (done in User fingerprint hash should be different by default on separate websites #6824)
    • (This ensures that the same person tracked on several websites within one Piwik instance cannot be cross-matched across those websites.)

The fingerprint hash is stored in the Piwik database in the field config_id
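
The effect of the two seeds can be illustrated with a small sketch. This is not the getConfigHash implementation: the function `fingerprint_hash`, the attribute names, the serialisation format, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib


def fingerprint_hash(attributes: dict, instance_salt: str, site_id: int) -> str:
    """Hash visitor attributes together with a per-instance salt and a site ID.

    Hypothetical sketch: the real attribute list and hash function may differ.
    """
    # Sort keys so the same attributes always serialise identically.
    parts = [f"{key}={attributes[key]}" for key in sorted(attributes)]
    payload = "|".join([instance_salt, str(site_id)] + parts)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


visitor = {"ip": "192.0.2.10", "resolution": "1920x1080", "plugins": "pdf,flash"}

same_site = fingerprint_hash(visitor, "salt-of-instance-A", 1)
other_site = fingerprint_hash(visitor, "salt-of-instance-A", 2)
other_instance = fingerprint_hash(visitor, "salt-of-instance-B", 1)

# The same visitor yields unrelated hashes per website and per instance.
print(same_site != other_site, same_site != other_instance)
```

Because the salt and the Website ID are part of the hashed input, identical browser attributes do not produce a value that can be matched across websites or instances without knowing the salt.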

Privacy challenges

Imagine, for example, that a Piwik database is seized by ex-colleagues of Edward Snowden (spies) who would like to use the Piwik data to spy on users who were tracked in Piwik.

When seizing a Piwik Database:

  • if IP anonymisation is not enabled, the Piwik DB will give spies the complete trail of user actions on the website for a given 'IP address' or 'Visitor ID'
  • if IP anonymisation is enabled, the ability to spy is a bit more limited. The Piwik DB will give spies the complete trail of user actions on the website for a given 'Anonymised IP address' or 'Visitor ID'
  • Spies can always look up all actions for a given 'Visitor ID' assuming:
    • the user had First party cookies enabled.
    • the user was using the same browser over time, and did not delete the cookies
  • Spies can look up all actions for a user that uses a particular browser, and/or a particular OS, and/or a set of plugins
    • (Piwik stores the browser, OS and plugin info in the tracking log tables)
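
The lookups listed above amount to grouping log rows by one identifying field. Here is a minimal sketch over made-up in-memory rows; apart from idvisitor, the field names (`ip`, `time`, `url`) and all the data are hypothetical and do not reflect the real Piwik schema.

```python
from collections import defaultdict

# Made-up rows; the IPs are already anonymised (last byte zeroed).
log_rows = [
    {"idvisitor": "0123456789abcdef", "ip": "192.0.2.0", "time": 1, "url": "/home"},
    {"idvisitor": "fedcba9876543210", "ip": "192.0.2.0", "time": 2, "url": "/blog"},
    {"idvisitor": "0123456789abcdef", "ip": "198.51.100.0", "time": 3, "url": "/pricing"},
    {"idvisitor": "0123456789abcdef", "ip": "198.51.100.0", "time": 4, "url": "/account"},
]


def trails_by(rows, field):
    """Group visited URLs by the given field, in chronological order."""
    trails = defaultdict(list)
    for row in sorted(rows, key=lambda r: r["time"]):
        trails[row[field]].append(row["url"])
    return dict(trails)


# The Visitor ID links the full trail even though the IP changed.
print(trails_by(log_rows, "idvisitor")["0123456789abcdef"])
# An anonymised IP mixes visitors but still exposes a trail.
print(trails_by(log_rows, "ip")["192.0.2.0"])
```

In this sketch the anonymised IP is shared by two visitors, which blurs the trail, while the Visitor ID still ties every action of one browser together.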

Improve privacy

Since our goal is to improve the Privacy by default for users being tracked in Piwik (#6160), we wanted to explain how this works.

Note that to improve Privacy in your Piwik server and prevent long term surveillance of users via the Piwik database, you can already do the following:

To help limit surveillance we should work on: #5907

Maybe there isn't much more we can do but feel free to leave a comment if you have suggestions.

@mattab mattab added this to the Mid term milestone Apr 12, 2015
@mattab mattab added Task Indicates an issue is neither a feature nor a bug and it's purely a "technical" change. c: Privacy For issues that impact or improve the privacy. labels Apr 12, 2015
@mattab mattab changed the title Surveillance of Piwik users via Fingerprint hash and Visitor ID Surveillance of Piwik users via Fingerprint hash and Visitor ID and Visitor device Apr 13, 2015
@mattab mattab modified the milestones: Long term, Mid term Dec 5, 2016

mattab commented Dec 29, 2016

I've documented in detail how the visitor recognition works here and in this FAQ: How does Piwik detect unique and returning visitors? (with User ID, Visitor ID from cookie and/or fingerprint)

Our privacy guidelines are documented in: https://piwik.org/docs/privacy/

For any further request or comment, please comment here or create a new issue.

See also our Privacy label for issues: https://github.com/piwik/piwik/labels/c%3A%20Privacy

@mattab mattab closed this as completed Dec 29, 2016
@mattab mattab added the answered For when a question was asked and we referred to forum or answered it. label Dec 29, 2016