@ethitter opened this Issue on April 25th 2015

WordPress 4.2 was released this week, and it includes full support for emoji, including in post titles and URLs. To take advantage of that, I published a post that used the 💥 emoji (https://s.w.org/images/core/emoji/72x72/1f4a5.png, in case it gets stripped out) in the title and URL. However, Piwik failed to track any views of the post because piwik.php returns a 400 Bad Request status code. I confirmed against two other tracking systems that the post received views that Piwik should have captured.

Is this a problem with the DB encoding (utf8 vs utf8mb4) or an issue in the PHP handling of the title and URL inputs when they include extended UTF-8 characters?

@mattab commented on May 22nd 2015 Member

Hi @ethitter
Thanks for the report. I can confirm that URLs with emoji are not tracked. This is likely because the MySQL tables use utf8 and would have to be changed to utf8mb4. Note: the WordPress devs blogged about this change at https://make.wordpress.org/core/2015/04/02/the-utf8mb4-upgrade/
It looks non-trivial, so unfortunately we can't do it soon.
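
To illustrate the utf8 vs utf8mb4 point: MySQL's utf8 charset stores at most three bytes per character, while emoji encode to four bytes in UTF-8. A minimal Python sketch of that check follows; the helper name `needs_utf8mb4` is made up for illustration and is not part of Piwik.

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if any character encodes to 4 bytes in UTF-8.

    Such characters lie outside the Basic Multilingual Plane (code point
    above U+FFFF) and do not fit in a MySQL utf8 (3-byte) column.
    """
    return any(ord(ch) > 0xFFFF for ch in text)


print(len("💥".encode("utf-8")))     # 4
print(needs_utf8mb4("Hello 💥"))      # True
print(needs_utf8mb4("Héllo, 世界"))   # False
```

Accented Latin letters and CJK characters in the Basic Multilingual Plane take at most three bytes, which is why they are stored without trouble while emoji are not.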

@sgiehl commented on May 22nd 2015 Member

Shouldn't we do a 'quickfix' so that those URLs will still be tracked? Maybe with the emoji cut off?

@ethitter commented on May 23rd 2015

Also relevant is Andrew Nacin's discussion of the security issues around these changes: https://www.youtube.com/watch?v=yQaRUEwEKxE. Simply updating the table encodings may not be sufficient; it wasn't for WordPress.

I like the idea of a quickfix that just drops the emoji, but that would likely break the tracked URLs, since the emoji would need to be stripped from them too.

@mattab commented on September 11th 2015 Member

> Shouldn't we do a 'quickfix' so that those URLs will still be tracked? Maybe with the emoji cut off?

@sgiehl Makes sense: it would be better to track partially incorrect data than no data at all. Maybe instead of removing emoji, we could replace them with *** or similar. Feel free to investigate if you have time :)

@mattab commented on September 15th 2015 Member

FYI: Piwik now tracks URLs with emoji, but each emoji (and every other 4-byte UTF-8 character) is replaced by the � character. This was done in #8765.
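
The replacement behaviour described above can be sketched roughly as follows. This is an illustrative Python approximation, not the code from #8765; the names `FOUR_BYTE_CHARS` and `replace_four_byte_chars` are assumptions made for the example.

```python
import re

# Matches every code point outside the Basic Multilingual Plane, i.e. every
# character that takes 4 bytes in UTF-8 (emoji included).
FOUR_BYTE_CHARS = re.compile("[\U00010000-\U0010FFFF]")


def replace_four_byte_chars(text: str, replacement: str = "\ufffd") -> str:
    """Swap each 4-byte UTF-8 character for the replacement string.

    The default is U+FFFD (the replacement character), which itself
    takes only 3 bytes in UTF-8 and so fits in a MySQL utf8 column.
    """
    return FOUR_BYTE_CHARS.sub(replacement, text)


print(replace_four_byte_chars("/blog/💥-boom/"))  # /blog/�-boom/
```

Passing a different `replacement`, such as `"***"`, gives the variant suggested earlier in the thread.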

@mattab commented on September 15th 2015 Member

This is fixed. Created: #8790 Tracking API: track Emoji correctly in page URLs and others

@gmariani commented on May 9th 2018

Still having this issue with 3.4.0 on PHP 7.2

[09-May-2018 14:11:33 UTC] Error in Matomo: Your Matomo version 3.4.0 is up to date.
[09-May-2018 14:11:43 UTC] Error in Matomo (tracker): Error query: SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF0\x9F\x8F\xA1 C...' for column 'name' at row 1 In query: INSERT INTO piwik_log_action (name, hash, type, url_prefix) VALUES (?,CRC32(?),?,?) Parameters: array ( 0 => '� Chandler Arizona Luxury Homes | [John Cunningham 2018]', 1 => '� Chandler Arizona Luxury Homes | [John Cunningham 2018]', 2 => 4, 3 => NULL, )
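
For reference, the bytes quoted in the 1366 error decode to a single 4-byte UTF-8 character, which is consistent with a `name` column that only accepts 3-byte utf8. A quick check in Python (a side calculation, not part of Matomo):

```python
import unicodedata

# The byte sequence reported in the "Incorrect string value" message.
raw = b"\xF0\x9F\x8F\xA1"

# Decode it as UTF-8: four bytes yield exactly one character.
char = raw.decode("utf-8")

print(len(raw), len(char), hex(ord(char)))  # 4 1 0x1f3e1
print(unicodedata.name(char))               # official Unicode name of the character
```

The code point is above U+FFFF, so the character lies outside the range that MySQL's utf8 charset can store.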

@mattab commented on May 9th 2018 Member

@gmariani This is not supposed to trigger an error. Could you please open a new issue (this one is already closed) and paste the piwik.php?.... request that creates this error? We will make sure to address it. Thanks

This Issue was closed on September 15th 2015