@yhlin00001 opened this Issue on September 13th 2018

Windows 2016 IIS + PHP 7.1.10
If the page URL contains Chinese characters, it is displayed garbled on the visit record page and in JSON/XML exports (but the data is correct in the DB).

How can I solve it?

@yhlin00001 commented on September 13th 2018

in download
image

in visit record
image

@tsteur commented on September 13th 2018 Member

Can you maybe paste the path part of the URL in here? This will make it easier to reproduce so we can copy/paste. Cheers

@yhlin00001 commented on September 13th 2018

Also, Piwik cannot export data in JSON format if it contains this kind of visit record.
We tested in Matomo 3.2.0 and 3.5.1.

@yhlin00001 commented on September 15th 2018

I think I have solved this issue, but please help me check whether it has any side effects. Thanks.

In PageUrl.php, I changed the method as follows:

public static function reconstructNormalizedUrl($url, $prefixId)
{
    $map = array_flip(self::$urlPrefixMap);

    if ($prefixId !== null && isset($map[$prefixId])) {
        $fullUrl = $map[$prefixId] . $url;
    } else {
        $fullUrl = $url;
    }

    // Clean up host & hash tags, for URLs
    // YH
        $fullUrl = urlencode($fullUrl);
    $parsedUrl = @parse_url($fullUrl);
    // YH
        $parsedUrl[path] = urldecode($parsedUrl[path]);
        $parsedUrl[query] = urldecode($parsedUrl[query]);
    echo '--parseUrl_1===';
    print_r($parsedUrl);
    $parsedUrl = PageUrl::cleanupHostAndHashTag($parsedUrl);
    echo '--parseUrl_2===';
    print_r($parsedUrl);
    $url       = UrlHelper::getParseUrlReverse($parsedUrl);

    if (!empty($url)) {
        echo '--url='.$url.'<br/>';
        return $url;
    }

    echo '--fullUrl='.$fullUrl.'<br/>';
    return $fullUrl;
}
@yhlin00001 commented on September 15th 2018

Sorry, this is the correct code:

public static function reconstructNormalizedUrl($url, $prefixId)
{
    $map = array_flip(self::$urlPrefixMap);

    if ($prefixId !== null && isset($map[$prefixId])) {
        $fullUrl = $map[$prefixId] . $url;
    } else {
        $fullUrl = $url;
    }

    // Clean up host & hash tags, for URLs
    // YH
        $fullUrl = urlencode($fullUrl);
    $parsedUrl = @parse_url($fullUrl);
    // YH
        $parsedUrl[path] = urldecode($parsedUrl[path]);
        $parsedUrl[query] = urldecode($parsedUrl[query]);
    $parsedUrl = PageUrl::cleanupHostAndHashTag($parsedUrl);
    $url       = UrlHelper::getParseUrlReverse($parsedUrl);

    if (!empty($url)) {
        return $url;
    }

    return $fullUrl;
}
@fdellwing commented on September 15th 2018 Contributor

It would be best to create a PR; it's easier to read and test.

@tsteur commented on September 16th 2018 Member

FYI: path and query are not defined in the above code. The method does seem like a likely place for the bug, since it is used by both the downloads report and the visitor details report.

@yhlin00001 commented on September 21st 2018

Finally, we replaced parse_url with the following method:
public static function mb_parse_url($url)
{
    $enc_url = preg_replace_callback(
        '%[^:/@?&=#]+%usD',
        function ($matches)
        {
            return urlencode($matches[0]);
        },
        $url
    );

    $parts = parse_url($enc_url);

    if($parts === false)
    {
        throw new \InvalidArgumentException('Malformed URL: ' . $url);
    }

    foreach($parts as $name => $value)
    {
        $parts[$name] = urldecode($value);
    }

    return $parts;
}
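
To illustrate what the method above does, here is a minimal usage sketch. It is written as a standalone function rather than a class method so that it runs on its own, and the example URL is made up for illustration; it is not taken from the affected site. Each run of non-delimiter characters is percent-encoded so that parse_url only sees ASCII, and every component is decoded again afterwards.

```php
<?php
// Standalone copy of the mb_parse_url() approach, for illustration only.
function mb_parse_url($url)
{
    // Percent-encode every run of characters that is not a URL delimiter,
    // so that parse_url() only has to deal with ASCII input.
    $encUrl = preg_replace_callback(
        '%[^:/@?&=#]+%usD',
        function ($matches) {
            return urlencode($matches[0]);
        },
        $url
    );

    $parts = parse_url($encUrl);
    if ($parts === false) {
        throw new \InvalidArgumentException('Malformed URL: ' . $url);
    }

    // Decode each component back to its original (multibyte) form.
    foreach ($parts as $name => $value) {
        $parts[$name] = urldecode($value);
    }

    return $parts;
}

// Hypothetical URL with a Chinese path and query string (assumption).
$parts = mb_parse_url('http://example.com/下載/檔案.pdf?名稱=測試');
print_r($parts);
```

With this input, the path and query components should come back as the original UTF-8 strings instead of garbled bytes. Note that urldecode is applied to every component, so a port, if present, is returned as a string rather than an integer.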