@anonymous-piwik-user opened this Issue on November 5th 2011

File: SearchEngines.php

Original (shows incorrect encoding):
// Mail.ru
'go.mail.ru' => array('Mailru', 'q', 'search?q={k}', 'windows-1251'),

I changed to:
// Mail.ru
'go.mail.ru' => array('Mailru', 'q', 'search?rch=e&q={k}'),

And now it seems to work correctly.

@robocoder commented on November 6th 2011 Contributor

(In [5413]) fixes #2761 - confirmed that go.mail.ru search results are now utf-8

@kiav commented on January 18th 2012

As of now, Mail.ru uses UTF-8 in most cases, but occasionally it still uses windows-1251.

I had to change the extractSearchEngineInformationFromUrl function in /core/Common.php:

if(function_exists('iconv')
    && isset($searchEngines[$refererHost][3]))
{
    // accepts string, array or comma separated list string in preferred order
    if (!is_array($searchEngines[$refererHost][3]))
        $charsets = explode(',', $searchEngines[$refererHost][3]);
    else
        $charsets = $searchEngines[$refererHost][3];

    if(!empty($charsets))
    {
        $charset = mb_detect_encoding($key, $charsets);
        if ($charset === false)
            $charset = $charsets[0];

        $newkey = @iconv($charset, 'UTF-8//IGNORE', $key);
        if(!empty($newkey))
        {
            $key = $newkey;
        }
    }
}

It works with

'go.mail.ru' => array('Mailru', 'q', 'search?q={k}', array('UTF-8', 'windows-1251')),

in /core/DataFiles/SearchEngines.php
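
For readers who want to try the idea outside Piwik, here is a minimal standalone sketch of the same detect-then-convert approach. The function name convertKeywordToUtf8 is hypothetical and not part of Piwik; the sketch also assumes it is acceptable to fall back to the first listed charset when mbstring is missing or detection fails, and it uses strict detection, which the patch above does not.

```php
<?php
// Hypothetical helper, not part of Piwik: converts a search keyword to UTF-8
// given candidate charsets in preferred order.
function convertKeywordToUtf8($key, $charsets)
{
    // Accept either an array or a comma separated string.
    if (!is_array($charsets)) {
        $charsets = array_map('trim', explode(',', $charsets));
    }
    // Without iconv or without candidates, leave the keyword untouched.
    if (empty($charsets) || !function_exists('iconv')) {
        return $key;
    }

    // Default to the first listed charset; override it only when mbstring
    // is available and detection succeeds.
    $charset = $charsets[0];
    if (function_exists('mb_detect_encoding')) {
        $detected = mb_detect_encoding($key, $charsets, true);
        if ($detected !== false) {
            $charset = $detected;
        }
    }

    // Suppress iconv notices and keep the original keyword on failure.
    $converted = @iconv($charset, 'UTF-8//IGNORE', $key);
    return ($converted === false || $converted === '') ? $key : $converted;
}

// The Russian word "привет" as windows-1251 bytes.
$cp1251 = "\xEF\xF0\xE8\xE2\xE5\xF2";
echo convertKeywordToUtf8($cp1251, array('UTF-8', 'windows-1251')), "\n";
```

Listing UTF-8 first matters here: a keyword that is valid UTF-8 is kept as is, and only byte sequences that fail strict UTF-8 validation fall through to windows-1251.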

@robocoder commented on January 18th 2012 Contributor

Thanks for the patch.

I don't think we need to support a comma-separated list. We do have to check for mbstring and have a unit test.

@kiav commented on January 18th 2012

A comma-separated list is already supported by mb_detect_encoding.

By the way, mb_strtolower is already used in Common.php (in the original Piwik code, in the extractSearchEngineInformationFromUrl function) without any checks or tests.
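
A small sketch of the first point, assuming the mbstring extension is loaded: mb_detect_encoding takes its candidate list either as an array or as a single comma separated string, and both forms should give the same result. The sample bytes are an illustrative windows-1251 keyword, not a real referrer.

```php
<?php
// "привет" encoded as windows-1251; these bytes are not valid UTF-8.
$key = "\xEF\xF0\xE8\xE2\xE5\xF2";

// Same candidates, passed once as a string and once as an array.
$fromString = mb_detect_encoding($key, 'UTF-8,windows-1251', true);
$fromArray  = mb_detect_encoding($key, array('UTF-8', 'windows-1251'), true);

var_dump($fromString === $fromArray);
```

Note that the fallback in the patch above reads $charsets[0], so the list still has to be an array at that point even though the detection call would accept the string directly.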

@robocoder commented on January 18th 2012 Contributor

Can you provide a sample referrer url with windows-1251 encoding?

I've done some refactoring and added some more tests, but can never have enough.

@kiav commented on January 18th 2012
@robocoder commented on January 18th 2012 Contributor

Awesome! Thanks!

@robocoder commented on January 18th 2012 Contributor

(In [5682]) fixes #2761

This Issue was closed on January 18th 2012