signs instead of unicode #9078

Closed
galr52 opened this issue Oct 22, 2015 · 19 comments
Labels
worksforme The issue cannot be reproduced and things work as intended.

Comments

@galr52

galr52 commented Oct 22, 2015

I installed Piwik 2.12.1 on Windows Server 2008 R2 with IIS 7.5.
I started tracking some sites and it worked perfectly. A few days ago I added a new site to track, and with that site only I ran into a strange problem: some of the data is saved as "xx_x_xxx_xx" instead of Unicode (Hebrew). I noticed that there are some specific characters that Piwik does not recognize, and as a result the entire value gets lost.

** All my tracked sites are in Hebrew **
I tried replacing the font, as shown here: http://piwik.org/faq/how-to-install/faq_142/

I tried changing the collation in the DB to "utf8_unicode_ci".

I tried creating a new site with a form and an input, and the problem occurs on that site too.

The code looks like this:

<!-- piwik tracking code -->
<!-- End Piwik code -->

<form action="" method="get">
     <input name="q" type="text" />
     <input type="submit" value="search" />
</form>

I added a screenshot of the "piwik_log_action" table that shows the problem.

The red rectangles mark the broken entries; the green rectangles are OK.

http://imageshack.com/a/img633/8540/bSPlrC.jpg

thanks in advance,

Gal

@sgiehl
Member

sgiehl commented Oct 22, 2015

Please try updating to the latest 2.15. That should at least fix the issue of some data being lost. See #8765 / #7766

@galr52
Author

galr52 commented Oct 22, 2015

I forgot to mention it, but I just updated to 2.14.3.

@galr52
Author

galr52 commented Oct 22, 2015

I updated to 2.15 from here: https://github.com/piwik/piwik

and the problem wasn't solved :(

@sgiehl
Member

sgiehl commented Oct 22, 2015

Wasn't it solved for new entries as well? Entries that are already broken won't be fixed retroactively...

@galr52
Author

galr52 commented Oct 22, 2015

Yes, new entries also show up as "xx_$$XX_X" (signs)...

@sgiehl
Member

sgiehl commented Oct 22, 2015

Could you check in your browser's dev tools which request params are sent to the Piwik tracker?
It would be good to know which characters can't be printed there...

@galr52
Author

galr52 commented Oct 22, 2015

Which parameter are you looking for?

I send the data in the URL, for example: "http://localhost/Home?q=בוהן"

which is saved in the DB as "xxxx_".
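
For reference, here is a minimal sketch of how the raw bytes of the tracked URL can be inspected once a tracking request has been captured. The query string below is hypothetical: `idsite` and `rec` are filled in for illustration, and only the `url` parameter appears elsewhere in this thread.

```php
<?php
// Hypothetical captured tracking request. The page URL travels
// percent-encoded inside the `url` parameter.
$pageUrl = "http://localhost/Home?q=\u{05D1}\u{05D5}\u{05D4}\u{05DF}"; // q=בוהן
$query   = 'idsite=1&rec=1&url=' . rawurlencode($pageUrl);

// parse_str() percent-decodes each value, so $params['url'] holds the
// original UTF-8 bytes again.
parse_str($query, $params);

// Dump the search term as hex to see exactly which bytes were sent.
$term = substr($params['url'], strlen('http://localhost/Home?q='));
$hex  = bin2hex($term);

echo $params['url'], "\n";
echo $hex, "\n"; // d791d795d794d79f
```

If the hex dump matches the expected UTF-8 bytes at this point, the request itself is intact and the damage happens later, on the server side.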

@sgiehl
Member

sgiehl commented Oct 22, 2015

Ok. We'll need to test that. Maybe it's related to #8790

@galr52
Author

galr52 commented Oct 22, 2015

I tested every letter separately to find the problematic ones:

| Unicode code point | Character | UTF-8 (hex) | Name | Works? |
| --- | --- | --- | --- | --- |
| U+05D0 | א | d7 90 | HEBREW LETTER ALEF | no |
| U+05D1 | ב | d7 91 | HEBREW LETTER BET | yes |
| U+05D2 | ג | d7 92 | HEBREW LETTER GIMEL | yes |
| U+05D3 | ד | d7 93 | HEBREW LETTER DALET | yes |
| U+05D4 | ה | d7 94 | HEBREW LETTER HE | yes |
| U+05D5 | ו | d7 95 | HEBREW LETTER VAV | yes |
| U+05D6 | ז | d7 96 | HEBREW LETTER ZAYIN | yes |
| U+05D7 | ח | d7 97 | HEBREW LETTER HET | yes |
| U+05D8 | ט | d7 98 | HEBREW LETTER TET | yes |
| U+05D9 | י | d7 99 | HEBREW LETTER YOD | yes |
| U+05DA | ך | d7 9a | HEBREW LETTER FINAL KAF | no |
| U+05DB | כ | d7 9b | HEBREW LETTER KAF | yes |
| U+05DC | ל | d7 9c | HEBREW LETTER LAMED | not exists |
| U+05DD | ם | d7 9d | HEBREW LETTER FINAL MEM | not exists |
| U+05DE | מ | d7 9e | HEBREW LETTER MEM | not exists |
| U+05DF | ן | d7 9f | HEBREW LETTER FINAL NUN | not exists |
| U+05E0 | נ | d7 a0 | HEBREW LETTER NUN | yes |
| U+05E1 | ס | d7 a1 | HEBREW LETTER SAMEKH | yes |
| U+05E2 | ע | d7 a2 | HEBREW LETTER AYIN | yes |
| U+05E3 | ף | d7 a3 | HEBREW LETTER FINAL PE | yes |
| U+05E4 | פ | d7 a4 | HEBREW LETTER PE | yes |
| U+05E5 | ץ | d7 a5 | HEBREW LETTER FINAL TSADI | yes |
| U+05E6 | צ | d7 a6 | HEBREW LETTER TSADI | yes |
| U+05E7 | ק | d7 a7 | HEBREW LETTER QOF | yes |
| U+05E8 | ר | d7 a8 | HEBREW LETTER RESH | yes |
| U+05E9 | ש | d7 a9 | HEBREW LETTER SHIN | yes |
| U+05EA | ת | d7 aa | HEBREW LETTER TAV | yes |
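
One possible reading of these results, offered as a hypothesis that nothing in this thread confirms: the failing letters are exactly those whose trailing UTF-8 byte has no character assigned in the Windows-1255 (Hebrew) code page. A byte-wise "is this a control character" check running under such a locale might therefore treat those bytes as unprintable. The sketch below only checks that the two sets line up; the helper names are made up for illustration.

```php
<?php
// Bytes in the range 0x90-0x9F that are unassigned in Windows-1255.
// Assumption: a locale-dependent control-character test rejects these.
const UNASSIGNED_CP1255 = [0x90, 0x9A, 0x9C, 0x9D, 0x9E, 0x9F];

// Hypothetical helper: UTF-8 encode a code point from the two-byte range
// (U+0080..U+07FF), which covers the Hebrew block.
function utf8TwoByte(int $cp): string
{
    return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
}

// List the letters U+05D0..U+05EA whose trailing byte is in the set above.
function suspectHebrewLetters(): array
{
    $suspects = [];
    for ($cp = 0x05D0; $cp <= 0x05EA; $cp++) {
        $trailing = ord(utf8TwoByte($cp)[1]);
        if (in_array($trailing, UNASSIGNED_CP1255, true)) {
            $suspects[] = sprintf('U+%04X', $cp);
        }
    }
    return $suspects;
}

print_r(suspectHebrewLetters());
```

The result lists U+05D0, U+05DA, U+05DC, U+05DD, U+05DE and U+05DF, the same six letters marked "no" or "not exists" in the table. That is suggestive but not proof, since the server locale is not stated anywhere in the thread.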

@galr52
Author

galr52 commented Oct 28, 2015

Is there any progress?

I want to try debugging the problem. Which function handles the request? (Or where should I start?)

Thanks a lot!

@mattab
Member

mattab commented Oct 29, 2015

Hi @galr52, maybe try enabling debugging on the tracker: http://developer.piwik.org/api-reference/tracking-api#debugging-the-tracker

Maybe you'll find some interesting information?

@galr52
Author

galr52 commented Nov 2, 2015

After debugging, I found that the problem starts at piwik/plugins/Actions/Actions/ActionSiteSearch.php, line 231: $parsedUrl = @parse_url($originalUrl);

The request is OK until it is parsed; after that, some Unicode characters get mangled.

Any idea?

@galr52
Author

galr52 commented Nov 2, 2015

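I solved the problem by adding this function to ActionSiteSearch.php:

// Percent-encode every run of characters that is not URL punctuation,
// so that parse_url() only ever sees ASCII.
// Note: '%' is not in the excluded set, so input that is already
// percent-encoded would be encoded a second time.
function safe_urlencode($txt){
  $result = preg_replace_callback("/[^-\._~:\/\?#\\[\\]@!\$&'\(\)\*\+,;=]+/",
    function ($match) {
      return rawurlencode($match[0]);
    }, $txt);
  return ($result);
}

and calling it before the URL is parsed.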
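
To illustrate the idea, here is a narrower sketch that is not the function above: it percent-encodes only bytes outside the ASCII range and leaves everything else, including existing `%XX` sequences, untouched. The name `encodeNonAsciiBytes` is hypothetical, and the URL is one of the examples from this thread.

```php
<?php
// Hypothetical variant: percent-encode only bytes >= 0x80. The pattern has
// no /u flag, so it matches raw bytes rather than code points.
function encodeNonAsciiBytes(string $url): string
{
    return preg_replace_callback(
        '/[^\x00-\x7F]+/',
        function (array $match): string {
            return rawurlencode($match[0]);
        },
        $url
    );
}

$originalUrl = "http://localhost:20891/Home/Index?q=\u{05D7}\u{05DC}\u{05EA}+\u{05D1}\u{05E6}\u{05DC}";

// parse_url() now receives pure ASCII, so no byte can be mistaken for
// something unprintable.
$encoded = encodeNonAsciiBytes($originalUrl);
$parts   = parse_url($encoded);

// parse_str() decodes the percent-escapes (and '+' as a space) again.
parse_str($parts['query'], $queryParams);

echo $encoded, "\n";
echo $queryParams['q'], "\n"; // חלת בצל
```

Because the query string is decoded only after `parse_url()` has split the URL, the search keyword comes back with its original bytes.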

@mattab
Member

mattab commented Nov 2, 2015

@galr52 can you paste the exact URL that was in $originalUrl? We will try to reproduce the issue and add an automated test for it, to make sure it does not break in the future.

@galr52
Author

galr52 commented Nov 3, 2015

http://localhost:20891/Home/Index?q=שום+דבר+לא+יעצור+אותי
or
http://localhost:20891/Home/Index?q=חלת+בצל

Any URL containing a "not exists" character ends up as "xx_xxxx_Xx_".

With the function above, everything works perfectly in every language I tested.

I had the same problem in Actions > Pages, so I added a call to the function in the init function

in piwik/core/Tracker.php:

    private function init()
    {
        $this->handleFatalErrors();
        // Call safe_urlencode on the tracked URL before anything parses it
        // (the function must be reachable as a method of this class).
        if (isset($_GET['url'])) {
            $_GET['url'] = $this->safe_urlencode($_GET['url']);
        }

        if ($this->isDebugModeEnabled()) {
            ErrorHandler::registerErrorHandler();
            ExceptionHandler::setUp();
            Common::printDebug("Debug enabled - Input parameters: ");
            Common::printDebug(var_export($_GET, true));
        }
    }

@mattab
Member

mattab commented Nov 18, 2015

I cannot reproduce this issue in Piwik 2.15.0 - can someone else reproduce this?

@mattab
Member

mattab commented Nov 26, 2015

It works for me to track http://localhost:20891/Home/Index?q=שום+דבר+לא+יעצור+אותי - both appear well in Page URLs (when I replace q= with hello=) and in Site Search reports.

mattab closed this as completed Nov 26, 2015
mattab added the "worksforme" label (The issue cannot be reproduced and things work as intended.) Nov 26, 2015
@galr52
Author

galr52 commented Dec 3, 2015

:( ok thanks for your help

@saqib16

saqib16 commented Apr 12, 2017

Hi,

We have upgraded to Piwik 3.0.2 and PHP 7.0.16 and are getting the following error:

Error in Piwik (tracker): Error query: SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xD0_\xD0\xBB\xD0\xB5...' for column 'name' at row 1 In query: INSERT INTO piwik_log_action (name, hash, type, url_prefix) VALUES (?,CRC32(?),?,?) Parameters: array ( 0 => '/products/Счетчики Ð_лектроÑ_нергии/?cid=91701', 1 => '/products/Счетчики Ð_лектроÑ_нергии/?cid=91701', 2 => 1, 3 => 0, )

Other characters are shown fine in the dashboard, but this error keeps appearing in the PHP error log. Any idea why?
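
The byte sequence in that error message is not valid UTF-8: `\xD0` starts a two-byte character but is followed by an ASCII `_`. A small sketch of how such a value could be detected before it reaches the database is shown below. The helper name is hypothetical, and the "intact" string assumes the original word began with Э (`D0 AD`), which the error message suggests but does not show.

```php
<?php
// Hypothetical helper: preg_match() with the /u flag returns 1 on valid
// UTF-8 input and false when the subject contains malformed sequences.
function isValidUtf8(string $value): bool
{
    return preg_match('//u', $value) === 1;
}

$broken = "\xD0_\xD0\xBB\xD0\xB5";     // first bytes quoted in the MySQL error
$intact = "\xD0\xAD\xD0\xBB\xD0\xB5";  // assumed original: "Эле"

var_dump(isValidUtf8($broken)); // bool(false)
var_dump(isValidUtf8($intact)); // bool(true)
```

A lead byte followed by `_` is the same pattern as the "xx_x" values reported earlier in this thread, which hints that a single byte of a multi-byte character was replaced before the insert.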
