GeoIP2 updater might try to download new file before it exists #18427

Open
Findus23 opened this issue Dec 1, 2021 · 5 comments · Fixed by #21468
Labels
Bug For errors / faults / flaws / inconsistencies etc.

Comments

@Findus23
Member

Findus23 commented Dec 1, 2021

Reported in https://forum.matomo.org/t/geoip2autoupdater-failed-to-unzip-the-downloaded-file-is-not-a-valid-geolocation-database/43925 (and also seen in my own cron job).

Expected Behavior

When a GeoIP2 update job runs at the very start of a month, the file for the new month (e.g. dbip-city-lite-2021-12.mmdb.gz) may not exist on the db-ip.com servers yet, so the update fails.

Current Behavior

ERROR [2021-12-01 02:03:30] 895774 /var/www/matomo/plugins/GeoIp2/GeoIP2AutoUpdater.php(189): GeoIP2AutoUpdater: failed to unzip '/var/www/matomo/tmp/latest/DBIP-City.mmdb.gz.download' after downloading 'https://download.db-ip.com/free/dbip-city-lite-2021-12.mmdb.gz': The downloaded file is not a valid geolocation database. Please re-check the URL or download the file manually. [Query: , CLI mode: 1]

Possible Solution

If a 404 is returned, Matomo could fetch the previous month's file again or reschedule the job for a few hours later.
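A minimal sketch of the fallback idea, in Python rather than Matomo's PHP and not taken from the Matomo code base. The URL pattern comes from the log line above; `candidate_urls`, `first_available` and the `exists` callback (for example a HEAD request that returns true unless the server answers 404) are hypothetical names used only for illustration.

```python
from datetime import date

# URL pattern as it appears in the error log quoted in this issue.
DBIP_URL_TEMPLATE = (
    "https://download.db-ip.com/free/"
    "dbip-city-lite-{year:04d}-{month:02d}.mmdb.gz"
)


def previous_month(year, month):
    """Return (year, month) of the month before the given one."""
    return (year - 1, 12) if month == 1 else (year, month - 1)


def candidate_urls(today):
    """Return this month's URL first, then last month's as a fallback."""
    current = (today.year, today.month)
    months = (current, previous_month(*current))
    return [DBIP_URL_TEMPLATE.format(year=y, month=m) for y, m in months]


def first_available(urls, exists):
    """Return the first URL for which exists(url) is true, else None.

    `exists` is a hypothetical callback supplied by the caller.
    A result of None would mean "reschedule the job for later".
    """
    for url in urls:
        if exists(url):
            return url
    return None
```

With this shape, a job running on 2021-12-01 would try the 2021-12 file first and fall back to the 2021-11 file if the new one is not published yet.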

@Findus23 Findus23 added the Bug For errors / faults / flaws / inconsistencies etc. label Dec 1, 2021
@tassoman
Contributor

tassoman commented Feb 1, 2023

Last night, our job kept trying to unzip until 5 AM GMT+1. Maybe we could simply move the monthly job to the 2nd of each month? 🤔

@sgiehl sgiehl added this to the For Prioritization milestone Feb 1, 2023
@PowerKiKi
Contributor

Same here, https://download.db-ip.com/free/dbip-city-lite-2023-11.mmdb.gz returns a 404 at the time of writing. Though it will probably work in a few hours...

Moving the cron to the 2nd of the month sounds like a good idea to limit these issues without ending up with data that is too stale.

@PowerKiKi
Contributor

I created #21468 as a possible fix for this issue.

sgiehl pushed a commit that referenced this issue Nov 2, 2023
When geolocation database updates are configured to be done monthly,
they were scheduled on the 3rd day of _the week_, so Wednesday. For
months that start on a Wednesday, that means that we tried to fetch a
file that might not exist yet.

We now schedule on the 3rd day of _the month_, as I believe was the
original intent of dc98a97 that introduced the behavior a long time
ago. So this gives 3 days to db-ip.com to update their files in all
cases.

Fixes #18427
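To illustrate the commit message above, here is a small Python check, assuming the old schedule resolved to the first Wednesday of the month (the exact scheduler semantics are not shown in this thread). The helper name `first_wednesday` is made up for this example. The three months checked are the ones in which failures were reported here.

```python
import calendar
from datetime import date


def first_wednesday(year, month):
    """Return the day of the month on which the first Wednesday falls."""
    first_weekday = date(year, month, 1).weekday()  # Monday == 0
    return 1 + (calendar.WEDNESDAY - first_weekday) % 7


# Months in which failures were reported in this issue.
for year, month in [(2021, 12), (2023, 2), (2023, 11)]:
    print(f"{year}-{month:02d}: first Wednesday is day {first_wednesday(year, month)}")
```

All three months start on a Wednesday, so under that assumption the job ran on the 1st, before the new file was necessarily available.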
@sgiehl
Member

sgiehl commented Nov 2, 2023

Reopening, as the PR only adjusted the day on which the download is attempted. We should still aim to implement proper handling for when the download fails.

@sgiehl sgiehl reopened this Nov 2, 2023
@tassoman
Contributor

tassoman commented Nov 4, 2023

There is no proper handling if the file is simply missing. I think an administrator notification in the GUI plus a logged error (which already exists) would be enough.
Fortunately, the older GeoIP data doesn't get wiped before the new data is downloaded.
So, as things stand, it's not a real problem; it happens rarely and only lasts a few hours.
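The "keep the old data until the new data is known to be good" behaviour could be sketched roughly as follows, in Python and not based on Matomo's actual updater. `install_if_valid` is a hypothetical helper; it only checks that the download is readable, non-empty gzip data (a 404 HTML page would not be), not that the contents are a valid MMDB database, and it reads the whole file into memory for simplicity.

```python
import gzip
import os
import tempfile
import zlib


def install_if_valid(download_path, target_path):
    """Decompress download_path over target_path only if it is readable gzip.

    Returns True if the target was replaced, False if the existing
    database was left untouched.
    """
    try:
        with gzip.open(download_path, "rb") as fh:
            data = fh.read()
    except (OSError, EOFError, zlib.error):
        # Not gzip (e.g. an HTML error page), truncated, or corrupt.
        return False
    if not data:
        return False

    # Write next to the target, then swap in one step so readers never
    # see a half-written database.
    directory = os.path.dirname(os.path.abspath(target_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(data)
    os.replace(tmp_path, target_path)
    return True
```

A caller could log the error and notify the administrator when this returns False, while tracking continues with the previous month's database.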

caddoo pushed a commit that referenced this issue Nov 5, 2023
When geolocation database updates are configured to be done monthly,
they were scheduled on the 3rd day of _the week_, so Wednesday. For
months that start on a Wednesday, that means that we tried to fetch a
file that might not exist yet.

We now schedule on the 3rd day of _the month_, as I believe was the
original intent of dc98a97 that introduced the behavior a long time
ago. So this gives 3 days to db-ip.com to update their files in all
cases.

Fixes #18427