Our log import (matomo-4.0.5) failed last night with the following error,
ERROR BulkTracking[2020-12-19 07:55:01 UTC] [a1651] Exception: /var/www/viabit.com/analytics/public/core/Tracker/Db/Pdo/Mysql.php(238): Error query: SQLSTATE[22007]: Invalid datetime format: 1366 Incorrect string value: '\xF0\x9F\x92\x95\xF0\x9F...' for column `piwik_viabit`.`log_action`.`name` at row 1 In query: INSERT INTO log_action (name, hash, type, url_prefix) VALUES (?,CRC32(?),?,?) Parameters: array (
ERROR BulkTracking[2020-12-19 07:55:01 UTC] [a1651] 0 => '💕🦸 Pills without prescription on 🎁 www.NetDoctor.store 🎁 Buy Levitra from $0.84/pill 🦸💕Purchase Levitra Canada - Levitra 20 Mg Order Online',
ERROR BulkTracking[2020-12-19 07:55:01 UTC] [a1651] 1 => '💕🦸 Pills without prescription on 🎁 www.NetDoctor.store 🎁 Buy Levitra from $0.84/pill 🦸💕Purchase Levitra Canada - Levitra 20 Mg Order Online',
ERROR BulkTracking[2020-12-19 07:55:01 UTC] [a1651] 2 => 8,
ERROR BulkTracking[2020-12-19 07:55:01 UTC] [a1651] 3 => NULL,
ERROR BulkTracking[2020-12-19 07:55:01 UTC] [a1651] )
ERROR BulkTracking[2020-12-19 07:55:01 UTC] [a1651] Error query: SQLSTATE[22007]: Invalid datetime format: 1366 Incorrect string value: '\xF0\x9F\x92\x95\xF0\x9F...' for column `piwik_viabit`.`log_action`.`name` at row 1 In query: INSERT INTO log_action (name, hash, type, url_prefix) VALUES (?,CRC32(?),?,?) Parameters: array (
ERROR BulkTracking[2020-12-19 07:55:01 UTC] [a1651] 0 => '💕🦸 Pills without prescription on 🎁 www.NetDoctor.store 🎁 Buy Levitra from $0.84/pill 🦸💕Purchase Levitra Canada - Levitra 20 Mg Order Online',
ERROR BulkTracking[2020-12-19 07:55:01 UTC] [a1651] 1 => '💕🦸 Pills without prescription on 🎁 www.NetDoctor.store 🎁 Buy Levitra from $0.84/pill 🦸💕Purchase Levitra Canada - Levitra 20 Mg Order Online',
ERROR BulkTracking[2020-12-19 07:55:01 UTC] [a1651] 2 => 8,
ERROR BulkTracking[2020-12-19 07:55:01 UTC] [a1651] 3 => NULL,
ERROR BulkTracking[2020-12-19 07:55:01 UTC] [a1651] )
These are the two log lines responsible:
www.example.org 83.171.252.205 - - [18/Dec/2020:09:49:03 -0500] "GET /?s=%F0%9F%92%95%F0%9F%A6%B8%20Pills%20without%20prescription%20on%20%F0%9F%8E%81%20www.NetDoctor.store%20%F0%9F%8E%81%20Buy%20Levitra%20from%20%240.84%2Fpill%20%F0%9F%A6%B8%F0%9F%92%95Purchase%20Levitra%20Canada%20-%20Levitra%2020%20Mg%20Order%20Online HTTP/1.1" 200 23328 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0"
www.example.org 83.171.252.205 - - [18/Dec/2020:09:49:04 -0500] "GET /?s=%F0%9F%92%95%F0%9F%A6%B8%20Pills%20without%20prescription%20on%20%F0%9F%8E%81%20www.NetDoctor.store%20%F0%9F%8E%81%20Buy%20Levitra%20from%20%240.84%2Fpill%20%F0%9F%A6%B8%F0%9F%92%95Purchase%20Levitra%20Canada%20-%20Levitra%2020%20Mg%20Order%20Online HTTP/1.1" 200 23328 "-" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0"
This is probably a duplicate of https://github.com/matomo-org/matomo-log-analytics/issues/293 - did you convert your tables to utf8mb4 already?
@orlitzky - Go to Settings->System check->Database abilities. The system check will tell you if there's something to do and, if yes, how it's done.
Thanks! I've got a huge database, but php ./console core:convert-to-utf8mb4
seems to be working. Christmas is coming, why not add that to core:upgrade
? =)
@orlitzky I'm not 100% sure, but converting the collation of the entire database should not be a forced process connected to the update flow. This should rather be a step after considering potential risks, free db disk space, possible problems with some MySQL versions, etc.
Armchair quarterbacking: regardless of the risks, the log importer and the database have to agree on the character set used in the database. Otherwise, (at best!) the log importer simply wont work -- as in this bug report. Clearly the log importer is designed to feed "real" utf8 into the database... IMO the most sensible way to make them agree is to make the database support it as well.
(You could teach the log importer to convert to utf8mb3, and make everyone use that in their databases, but why?)
Put another way, the current upgrade process is...
That's going to happen anyway, eventually, and the end result is the same. Why not eliminate all those steps?
As far as I know, the conversion is also advertised in the updater somewhere, but is not a mandatory step to do. I'm sure, things will be optimized in the future, based on user feedback
As @peterbo has written. Doing that in a "normal" update is a bit too risky. Also it is not supported by all MySQL versions and depends on some InnoDB settings.
Note: For new installs UTF8mb4 should be automatically used if available.
@orlitzky What database charset did you have set in your config.ini.php
before the utf8mb4 migration?
Note: If it is set to utf8
Matomo should automatically remove/replace unsupported chars before inserting anything in the database. When it's set to utf8mb4
that won't be done, as Matomo assumes the database is able to handle the chars.
I don't have a charset specified in config.ini.php
, but the global mariadb client/server defaults were utf8
until I started dealing with this yesterday.
That's going to happen anyway, eventually, and the end result is the same. Why not eliminate all those steps?
Totally understand where you are coming from and agree. Just fyi wanted to add that most users don't run into this problem because they aren't tracking any special characters or emojis vs it could create problems during the upgrade when forcing utf8mb4.
FWIW my conversion finally finished. I manually invalidated the reports for the 18th, re-archived, and everything is back to normal for now. Thanks again for the hint.
FWIW my conversion finally finished. I manually invalidated the reports for the 18th, re-archived, and everything is back to normal for now. Thanks again for the hint.
Feel free to close :)
I still think the conversion should take place automatically if the log importer relies upon it, but it won't bother me again... so, ok =)
I still think the conversion should take place automatically if the log importer relies upon it, but it won't bother me again... so, ok =)
(Having said I'm not in any way related to this repository ihih)
I think your idea is interesting but not feasible because on medium to big installations updating the database schema may cause a disservice of several minutes. This is an operation that should be executed only by a DBA or by a system administrator using command line access from the Matomo server. On the other hand the log importer can be executed by everyone, just adopting an AUTH token by an user with write
privileges (usually not even a super-administrator) moreover it usually communicates just via HTTP APIs and not via a direct access to the database (and this is a feature - I think - because it allows you to read logs from a machine and send data to whatever Matomo instance).
Oh and thank you again for opening this bug report. You saved me lot of minutes in troubleshooting the same issue today!