@sgiehl opened this Pull Request on April 5th 2018 Member

As using GeoIP2 requires a schema change on log tables to be able to track ISO region codes, GeoIP2 will be included as a new plugin. This way we can avoid an automatic schema change while updating Matomo.

This plugin will be automatically enabled for new installs. Instances updating to the new version will have the plugin included disabled. Enabling the plugin will then trigger the schema change.

As long as the GeoIP2 plugin is disabled nothing will change for Geolocation configuration, tracked data and API.

As soon as the GeoIP2 plugin is enabled some stuff changes:

Autoupdating of GeoIP databases
Autoupdating feature for databases will switch to configure updates for GeoIP2 databases.
For existing installs with configured updates for GeoIP, it will still show those settings as read only. This settings will be automatically removed as soon as GeoIP2 updates are configured

Tracking
As soon as a GeoIP2 provider is chosen region codes will be stored as ISO codes. [GeoIP Legacy used FIPS codes]. To convert old log data from FIPS to ISO the command usercountry:convert-region-codes can be used. The command will convert all codes stored before the date of switching to a GeoIP2 provider.

Changes in API
Until a GeoIP2 provider is selected nothing changes and still old FIPS codes and region names will be returned.
After switching to one all old codes included in archives will be converted [FIPS => ISO] if possible and region names will be used accordingly.

Switching back from GeoIP2 to GeoIP is only possible by disabling GeoIP2 plugin. This is on purpose as it may cause problems if FIPS and ISO codes are mixed too much in the database.

replaces #12553
fixes #4487

@diosmosis commented on April 6th 2018 Member

Noticed there's a bug in pear/archive_tar which piwik/decompress uses which causes a fatal error when downloading databases on PHP 7: https://pear.php.net/bugs/bug.php?id=21218

I think matomo/decompress uses the fixed version.

EDIT: one way to fix this is to run composer require pear/archive_tar locally.

@diosmosis commented on April 6th 2018 Member

@mattab tested locally all features seem to work, except unarchiving on PHP 7. Leaving up to you whether my remaining comments should be addressed.

@diosmosis commented on April 8th 2018 Member

@sgiehl Looked again, looks good to me except for php 7 archive tar fatal error (see my comment above for more info).

@sgiehl commented on April 8th 2018 Member

I read it. But that seems not a "bug" related to the changes for GeoIP2, right? It should currently occur for GeoIP Legacy as well I guess.
Maybe we could create a issue to fix that later. I'm not able to handle that the next days, if you see an easy solution feel free to fix it ;)

@diosmosis commented on April 8th 2018 Member

Easiest fix is to run composer require pear/archive_tar locally, I'll do this later today.

@diosmosis commented on April 9th 2018 Member

PHP 7 bug no longer exists after 8fbe252, the offending method is no longer called. Moving back to extractInString fixed this.

@sgiehl commented on April 23rd 2018 Member

I've now tested the maps with geoip2 changes_

Realtime map should work as before, as it doesn't use any region data.

The other maps will lack some information on region level. (Country and City level will still work as before) All countries where the new iso codes do not match any old fips codes the map will show those visits as "unknown region". (Note: US and Canada are not affected, as they used iso codes before, also some countries where the codes didn't change between fips and iso will still show up correctly).

To have working maps on region level with geoip2 we need to create completely new maps that also show correct regions (based in ISO) and then force all location providers to use ISO region codes in order to show region data on the map

@diosmosis commented on May 2nd 2018 Member

Did another code review, left two comments on the conversion command, will do another local test tomorrow.

@mattab commented on May 2nd 2018 Member

Feedback @sgiehl

  • Wizard: maybe we should simply remove the downloaded file if anything fails to download or extract
  • Once the wizard has been used, and the settings saved, we should reload the page, so that it says "Geoip2: installed". otherwise use has to manually refresh the page to actually enable geoip2 which is not obvious.
  • Region may be incorrectly detected. For example for IP address 194.57.91.215 we get the following "Location" result in the geoip2 demo: Besançon, Doubs, Bourgogne-Franche-Comte, France, Europe. The region name is Bourgogne-Franche-Comte. However in our test files we can see that this IP was sset to the region of Doubs which is not the region but the département (there are several departements in each region). -> this seems a big issue/regression that (can be seen in the test output)
  • Let's update the Region.php to define the segment & set the accepted values as ISO code? Currently they read protected $acceptValues = '01 02, OR, P8, etc.<br/>eg. region=A1;country=fr';
  • Getting the following error after enabling Geoip2 The value "" is not a valid IP address. -> as discussed on slack, maybe we should overwrite that method for geoip2 provider, and always use ip from header, without trying to convert it to ipv4 if possible?
@sgiehl commented on May 2nd 2018 Member

Wizard: maybe we should simply remove the downloaded file if anything fails to download or extract

it's not automatically removed when extracting fails

Once the wizard has been used, and the settings saved, we should reload the page, so that it says "Geoip2: installed". otherwise use has to manually refresh the page to actually enable geoip2 which is not obvious.

changed.

Region may be incorrectly detected. For example for IP address 194.57.91.215 we get the following "Location" result in the geoip2 demo: Besançon, Doubs, Bourgogne-Franche-Comte, France, Europe. The region name is Bourgogne-Franche-Comte. However in our test files we can see that this IP was sset to the region of Doubs which is not the region but the département (there are several departements in each region). -> this seems a big issue/regression that (can be seen in the test output)

GeoIP2 returns multiple regions. Currently I'm using the smallest known region that is detected. We could use the largest region detected, which would "fix" your regression for France. BUT it would cause changes for regions in UK. The first regions reported would be England, Ireland, Schottland.

For example the result for 212.44.43.3 for GeoIP Legacy is:

country_code,country_name,city_name,region_code,region_name
GB,United Kingdom,Linton,C3,Cambridgeshire

For GeoIP 2 using the first region it would be

country_code,country_name,city_name,region_code,region_name
GB,United Kingdom,Linton,ENG,England

So not sure how to handle that best... @mattab any thoughts? shall we use the smallest or the biggest detected region?

Let's update the Region.php to define the segment & set the accepted values as ISO code? Currently they read protected $acceptValues = '01 02, OR, P8, etc.
eg. region=A1;country=fr';

changed

Getting the following error after enabling Geoip2 The value "" is not a valid IP address. -> as discussed on slack, maybe we should overwrite that method for geoip2 provider, and always use ip from header, without trying to convert it to ipv4 if possible?

That should be fixed now.

@diosmosis commented on May 3rd 2018 Member

Fixed two minor issues I found, other than that looks good. Couple things I couldn't test:

  • test w/ nginx geoip2 module (didn't have time to configure the module)
  • test w/ Org/ISP databases
@diosmosis commented on May 3rd 2018 Member

Looks like there are a couple system test failures. Once those are fixed I'd say it's ready to merge (after updating screenshots).

@mattab commented on May 3rd 2018 Member

@sgiehl I'm not sure about this, but it seems like it could break a lot... it's quite important to get the region right..

-> Could you maybe compare what we used to have, to the new format, for the top 50 countries or so? Ideally we have a mapping where we know which region level we need to use for a given country, so it's correct for FR, UK, USA, and other 50+ countries...

@diosmosis just sent you the commercial ISP DB for testing

@sgiehl could you please apply pending feedback (test, UI files) and I'll proceed to merge the PR, ideally in 12 hours or so :+1:

@diosmosis commented on May 3rd 2018 Member

@mattab tested w/ isp db, no issues found

@sgiehl commented on May 3rd 2018 Member

-> Could you maybe compare what we used to have, to the new format, for the top 50 countries or so? Ideally we have a mapping where we know which region level we need to use for a given country, so it's correct for FR, UK, USA, and other 50+ countries...

There is no good way to check if it's the same. For almost all countries the region codes and names changed anyway by using ISO instead of FIPS.
For now I've added an exception for UK, so it uses the smallest region in this case. For all other countries it will now use the biggest region.

This Pull Request was closed on May 4th 2018
Powered by GitHub Issue Mirror