@mattab opened this Issue on May 22nd 2020 Member

With regard to privacy: since the database schema has latitude and longitude columns, how could we (or how do we) ensure that the values stored in them are precise to the city level at most?

It is a privacy concern that lat/long could be more precise than users might expect.
In upcoming recommendations it will be important to limit geolocation to the city level at most.
AFAIK we use lat/long to plot the user on the real-time map.
Independently of whether the user is geolocated using an anonymised IP or not, it'd be great to ensure the lat/long are never too precise.

Is this already the case in Matomo? If not, could we limit lat/long precision to the city (and how)?

@mattab commented on May 22nd 2020 Member

Also it'd be important to document this "feature" in the user guide at: https://matomo.org/docs/geo-locate/

@diosmosis commented on May 22nd 2020 Member

Possible solutions:

  • rounding lat/long values to some degree (the degree to be determined later)
  • keeping a db of city => lat/long pairs, though this seems far more difficult
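To get a feel for the first option, here is a minimal Python sketch of rounding a coordinate to a configurable number of decimal places, along with the approximate north-south size of one rounding step. This is not Matomo code: the function names are made up for illustration, and the figure of roughly 111.32 km per degree of latitude is an approximation.

```python
from decimal import Decimal, ROUND_HALF_EVEN

# Approximate north-south distance covered by one degree of latitude.
KM_PER_DEGREE_LATITUDE = 111.32


def round_coordinate(value, places):
    """Round a latitude or longitude to `places` decimal places."""
    quantum = Decimal(1).scaleb(-places)  # places=2 gives Decimal("0.01")
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_EVEN)


def latitude_cell_km(places):
    """Approximate north-south size of one rounding step, in kilometres."""
    return KM_PER_DEGREE_LATITUDE * 10 ** -places
```

With two decimal places, one step is about 1.1 km north to south; with one decimal place it is about 11 km. The east-west size shrinks towards the poles, so the number of places would still need to be chosen with care.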
@Findus23 commented on May 24th 2020 Member

See also https://github.com/matomo-org/matomo/issues/12735 for an even rougher rounding.

@tsteur commented on September 3rd 2020 Member

I just checked, and both DB-IP and MaxMind seem to report the last three digits as 000, so they basically round already. This could change in the future, though.

Also, I'm thinking rounding can still be a problem for rural areas where only a few people live. You could potentially still identify individuals or households there.

I'm not sure we can find a general solution to this besides optionally not tracking lat/long at all (which breaks only the real-time map). If I see this correctly, only the real-time map uses this info. Maybe the real-time map could be changed to work like the regular visitor map and not use lat/long?

@diosmosis commented on September 3rd 2020 Member

I'm not sure, but it looks like the visitor map converts a city to a lat/long pair, so this might be do-able pretty easily... still checking though

@diosmosis commented on September 4th 2020 Member

@tsteur nvm, that uses the tracked longitude/latitude. The easiest option is probably to somehow map locations to longitude/latitude; otherwise I think we'd have to change the real-time map significantly. It's probably fairly simple to write a script that iterates over every location in the geoip database and writes a lat/long for each one to a file.
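A rough sketch of what such a script could look like, in Python. Everything here is an assumption for illustration: the record layout (country, region, city, latitude, longitude), the `country|region|city` key format, and the sample rows are all made up, and reading records out of a real GeoIP database is left out entirely.

```python
import json


def build_location_map(records):
    """Map "country|region|city" keys to a single [lat, long] pair each."""
    location_map = {}
    for country, region, city, latitude, longitude in records:
        key = "|".join((country, region, city))
        # Keep the first pair seen so every visit from a location shares it.
        location_map.setdefault(key, [latitude, longitude])
    return location_map


def write_location_map(location_map, path):
    """Write the mapping to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(location_map, handle, indent=2, sort_keys=True)


# Made-up sample rows standing in for a real GeoIP database export.
SAMPLE_RECORDS = [
    ("DE", "BE", "Berlin", 52.5200, 13.4050),
    ("DE", "BE", "Berlin", 52.5244, 13.4105),
    ("NZ", "WGN", "Wellington", -41.2866, 174.7756),
]
```

The real-time map would then look up the fixed pair for a visit's city instead of using the tracked coordinates, so the precision of what is shown no longer depends on the provider.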

@mattab commented on September 4th 2020 Member

I'd say that, for their own reasons, it's always in the geolocation DB providers' interest not to provide more accurate lat/long.

MaxMind, for example, says at https://www.maxmind.com/en/geoip2-city:

Longitude (Latitude and Longitude are often near the center of population. These values are not precise and should not be used to identify a particular address or household.)

As a possible fix, maybe we could always set the last 3 digits to zero ourselves, since that's what MaxMind does today (in case they change it in the future)?
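A minimal sketch of that idea in Python, assuming coordinates are stored with six decimal places so that "the last 3 digits" are the fourth to sixth decimals. The function name is hypothetical, and it truncates rather than rounds, which is one possible reading of "set to zero".

```python
from decimal import Decimal, ROUND_DOWN


def zero_last_three_digits(value):
    """Return `value` as a six-decimal string whose last three digits are 000."""
    # Truncate towards zero at the third decimal place.
    truncated = Decimal(str(value)).quantize(Decimal("0.001"), rounding=ROUND_DOWN)
    # Pad back out to six decimal places.
    return format(truncated, ".6f")
```

Values that already end in 000 pass through unchanged, so this would be a no-op with today's provider data and would only take effect if a provider started returning more precise values.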

@tsteur commented on September 7th 2020 Member

I reckon that for now we maybe don't need to do anything in this case, and if someone wants to use a more accurate provider, they can do so.

Even with rounding etc., the problem would remain for locations where only a few people live, but I suppose those would maybe be assigned to a bigger nearby city anyway (would need to be checked).
