@mattab opened this Issue on March 9th 2009 Member

Baidu is the biggest search engine in China, and Piwik currently fails to detect keywords from Baidu.

Example queries:



Resolving this issue involves writing unit tests to cover these bits of code.
Also, we should check whether the code path around line 715 in core/Tracker/Visit.php is useful; if not, we should fix it or delete it.

@robocoder commented on March 10th 2009 Contributor

The problems with Baidu might be more complex than they appear at first glance:

  • the second URL uses the variable name "word" instead of "wd"
  • gb2312 is an encoding; are the keywords not utf-8?
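
To make the two points above concrete, here is a minimal sketch of keyword extraction that accepts either parameter name and normalizes the result to Unicode. It is written in Python for brevity rather than in Piwik's PHP, and it is not the code from core/Tracker/Visit.php. The function name `extract_keyword`, the constant `KEYWORD_PARAMS`, and the "try UTF-8 first, then fall back to GB2312" heuristic are all assumptions made for illustration.

```python
from urllib.parse import urlsplit, unquote_to_bytes

# Assumption: these are the only two parameter names Baidu uses for the
# keyword; "wd" and "word" are the names mentioned in this thread.
KEYWORD_PARAMS = ("wd", "word")


def extract_keyword(referrer_url):
    """Return the search keyword as a Unicode string, or None if absent.

    Hypothetical helper, not part of Piwik.
    """
    query = urlsplit(referrer_url).query
    for pair in query.split("&"):
        name, _, raw_value = pair.partition("=")
        if name not in KEYWORD_PARAMS or not raw_value:
            continue
        # Work on raw bytes: the percent-encoded value may not be UTF-8.
        raw = unquote_to_bytes(raw_value.replace("+", " "))
        try:
            return raw.decode("utf-8")
        except UnicodeDecodeError:
            # Heuristic fallback (an assumption): treat the bytes as GB2312.
            return raw.decode("gb2312", errors="replace")
    return None
```

With an illustrative referrer such as `http://www.baidu.com/s?wd=%D6%D0%B9%FA` (GB2312 bytes), the sketch yields the same string as it does for `http://www.baidu.com/s?word=%E4%B8%AD%E5%9B%BD` (UTF-8 bytes), so the keyword could be stored as UTF-8 in either case. The fallback is only a heuristic: some GB2312 byte sequences also happen to be valid UTF-8, so a more robust approach would use an explicit charset hint when the referrer provides one.
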
@mattab commented on March 20th 2009 Member

Also see #435, which is closely related.

@mattab commented on March 24th 2009 Member

(In [1014]) - cleaning up the search engine parsing code, adding tests, recording UTF8 keywords in the DB rather than encoded (as tables are now utf8, refs #5730)

  • adding tests in url.test.php and fixed double encoding in some edge cases
  • fixed #589 Piwik fails to properly decode and store some Chinese keywords (eg. from baidu.com)
  • fixed #435 Exotic encoded keywords should be stored as utf-8 in the DB
  • refs #575 hopefully fixed, will give it a few days of tests on piwik.org
This Issue was closed on March 24th 2009