Geolocation: German State/ Province huge "Unknown" entries because of wrong languages (db-ip.com) #19329

Closed
utrautmann opened this issue Jun 9, 2022 · 11 comments
Labels
Bug For errors / faults / flaws / inconsistencies etc. duplicate For issues that already existed in our issue tracker and were reported previously.

Comments

@utrautmann

The Region widget for State/Province shows a lot of "Unknown" entries, even though the cities for the same visits can be located.
The free database is used:
https://download.db-ip.com/free/dbip-city-lite-2022-06.mmdb.gz

The screenshots show the big difference between located visitors for Regions and for Cities.
[Two screenshots omitted]

Current Behavior

I have checked this database, and the CSV file from db-ip.com, for its State/Province entries.
For German regions, the State/Province entry is sometimes in German (Baden-Württemberg) and sometimes in English (Lower Saxony, Hesse, Bavaria).

It seems that whenever a German province is named in English in the db-ip.com database, the visitor's Region entry in Matomo is "Unknown".

That might be a content error at db-ip.com, but my question is: how does Matomo work here?
What does Matomo need, and are there any workarounds?
Does Matomo need the provinces in the language of the city?

  • Matomo Version: 4.10.1
@utrautmann utrautmann added the Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced. label Jun 9, 2022
@sgiehl
Member

sgiehl commented Jun 10, 2022

Hi @utrautmann
That's kind of a problem with the DB-IP files. The free versions only include the name of a region, but not the region's ISO code.
Matomo internally stores ISO codes only. If the geoip lookup provides a region name but no ISO code, we try to look up the ISO code based on the country code and the region name. For that we use a generated list of codes and names, which contains the names in each country's own language. See https://github.com/matomo-org/matomo/blob/4.x-dev/plugins/GeoIp2/data/isoRegionNames.php

So English names for German regions won't be mapped correctly.
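
To illustrate the lookup described above, here is a minimal sketch in Python. It is not Matomo's actual PHP code: the `REGION_NAMES` table and the `region_code` function are hypothetical, simplified stand-ins for the generated list, and the exact matching rules in Matomo may differ.

```python
# Hypothetical, simplified stand-in for the generated list of region names.
# Codes are the subdivision part of ISO 3166-2; names are in the local language.
REGION_NAMES = {
    "DE": {
        "BW": "Baden-Württemberg",
        "BY": "Bayern",
        "HE": "Hessen",
        "NI": "Niedersachsen",
    },
}


def region_code(country_code, region_name):
    """Return the region code for a name, or None if the name is not listed."""
    wanted = region_name.casefold()
    for code, name in REGION_NAMES.get(country_code, {}).items():
        if name.casefold() == wanted:
            return code
    return None  # would end up as an "Unknown" region in the report


print(region_code("DE", "Baden-Württemberg"))  # local name, found
print(region_code("DE", "Bavaria"))            # English name, not found
```

With only local-language names in the table, "Baden-Württemberg" resolves but "Bavaria", "Hesse" and "Lower Saxony" do not.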

@utrautmann
Author

Hi @sgiehl ,
thank you for your answer. This helps me further understand how it works. I have now asked db-ip.com whether there are plans to provide the regions (provinces) in the national language again.

In Matomo, would it be possible that in the absence of an ISO code in the geolocation database, the region information would be used unchanged in Matomo, or would that have other implications?

@sgiehl
Member

sgiehl commented Jun 11, 2022

@utrautmann We are storing ISO codes in the database and the database field won't be able to store more than 3 characters. So we can't simply store the name.
We could maybe try to additionally check for the English name of a region 🤔

@utrautmann
Author

@sgiehl
Here is the answer from db-ip.com.
It sounds like an approach that's the opposite of Matomo's.

"The IP to City Lite database uses English names when possible, or the local name if no translation is available."

@sgiehl
Member

sgiehl commented Jun 13, 2022

Interesting. So you actually can't rely on what they return, as you can't know what is translated and what isn't 🤷
Still wondering why they can't simply include the ISO code 🙈
Anyway. The problem is that we would need to fetch the English translations additionally, so we can compare against the English names as well... And depending on where they get their translations, they might even differ from the ones we might fetch 🤔

@sgiehl sgiehl added Bug For errors / faults / flaws / inconsistencies etc. and removed Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced. labels Jun 13, 2022
@sgiehl sgiehl added this to the For Prioritization milestone Jun 14, 2022
@utrautmann
Author

@sgiehl:
I asked again. Here is the answer from db-ip.com:
"Unfortunately ISO codes and Geoname IDs are not present in the Lite databases and we have no plans to add these fields in the future".

> Anyway. The problem is that we would need to fetch the English translations additionally, so we can compare against the English names as well... And depending on where they get their translations, they might even differ from the ones we might fetch

First of all, I don't think that's a bad idea. Of course, it's possible that the English translation has an error, but in that case there's really not much you can do.

@sgiehl
Member

sgiehl commented Jun 20, 2022

@justinvelluppillai We should maybe consider planning this one in soon. The lite version of db-ip.com, which we suggest by default, currently might report a lot of unknown regions because the region names can't be mapped.
We currently have a command to update the list of regions, which we should run to update the official region names. In addition, we should extend the command so it also tries to fetch the English name for each region. This could afterwards be used to check the region names coming from db-ip.com.
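
A rough sketch of how such an extended list could be used, written in Python for illustration only. The `REGIONS` structure with a `name` and an `english` key is an assumption about what the extended command might produce, and `build_index` and `region_code` are hypothetical helpers, not existing Matomo functions.

```python
# Assumed output of an extended update command: for every region, the local
# name plus an English name. The layout is hypothetical.
REGIONS = {
    "DE": {
        "BW": {"name": "Baden-Württemberg", "english": "Baden-Württemberg"},
        "BY": {"name": "Bayern", "english": "Bavaria"},
        "HE": {"name": "Hessen", "english": "Hesse"},
        "NI": {"name": "Niedersachsen", "english": "Lower Saxony"},
    },
}


def build_index(regions):
    """Map (country code, lower-cased name) to region code for both names."""
    index = {}
    for country, entries in regions.items():
        for code, names in entries.items():
            for name in (names["name"], names["english"]):
                index[(country, name.casefold())] = code
    return index


INDEX = build_index(REGIONS)


def region_code(country_code, region_name):
    """Return the region code for a local or English name, else None."""
    return INDEX.get((country_code, region_name.casefold()))


print(region_code("DE", "Bayern"))   # local name
print(region_code("DE", "Bavaria"))  # English name, same code
```

Names that neither list contains still resolve to nothing, so this only helps as far as the fetched English names agree with the ones db-ip.com uses.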

Another option might be for Matomo to send an official request to include ISO region codes in the lite database. We might be a big referrer for them, so maybe they would reconsider their plans 🤔

@dantefr

dantefr commented Jun 21, 2022

Hi,
The same problem exists with two French regions.
DB-IP -> MATOMO
Normandy -> Normandie
Brittany -> Bretagne

I'm not sure this is the right place, but there are too many dashes in the names of three French regions.
DB-IP -> MATOMO
Pays de la Loire -> Pays-de-la-Loire
Provence-Alpes-Côte d'Azur -> Provence-Alpes-Côte-d’Azur
Grand Est -> Grand-Est

I modified the "isoRegionNames.php" file, to have no more unknowns.
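
As an alternative to editing the file by hand, the dash differences listed above could in principle be absorbed by normalizing both names before comparing them. The following Python sketch is only an illustration of that idea; `normalize` is a hypothetical helper and nothing like it is confirmed to exist in Matomo. Note that in the second pair the apostrophe also differs (straight versus typographic), so the sketch handles that too.

```python
import unicodedata


def normalize(name):
    """Reduce a region name to a form that ignores dash and apostrophe style."""
    text = unicodedata.normalize("NFKC", name)
    text = text.replace("\u2019", "'")   # typographic apostrophe to ASCII
    text = text.replace("-", " ")        # treat hyphens like spaces
    return " ".join(text.split()).casefold()


pairs = [
    ("Pays de la Loire", "Pays-de-la-Loire"),
    ("Provence-Alpes-C\u00f4te d'Azur", "Provence-Alpes-C\u00f4te-d\u2019Azur"),
    ("Grand Est", "Grand-Est"),
    ("Normandy", "Normandie"),
]
for dbip_name, matomo_name in pairs:
    print(dbip_name, "|", matomo_name, "|", normalize(dbip_name) == normalize(matomo_name))
```

Normalization covers the three punctuation cases, but not translated names such as Normandy or Brittany, which still need an English name to compare against.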

@MatomoForumNotifications

This issue has been mentioned on Matomo forums. There might be relevant details there:

https://forum.matomo.org/t/how-to-set-geolocation-to-wider-region-instead-of-city-suburb/46341/2

@MatomoForumNotifications

This issue has been mentioned on Matomo forums. There might be relevant details there:

https://forum.matomo.org/t/problem-with-geography-cities-regions/47960/4

@sgiehl
Member

sgiehl commented Mar 28, 2023

Closing this one in favor of #20527

@sgiehl sgiehl closed this as not planned Mar 28, 2023
@sgiehl sgiehl removed this from the For Prioritization milestone Mar 28, 2023
@sgiehl sgiehl added the duplicate For issues that already existed in our issue tracker and were reported previously. label Mar 28, 2023