setDocumentTitle() expects an un-encoded string because piwik.js uses encodeURIComponent to encode parameters in the request.
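To illustrate why a pre-encoded title goes wrong, here is a minimal sketch. It assumes the tracker builds the title parameter with encodeURIComponent, as described above. The helper names buildTitleParam and decodeOnce and the parameter name action_name are used for illustration only and are not taken from piwik.js.

```javascript
// Hypothetical helper (not piwik.js code): encodes the title once,
// the way the tracker is described as doing for request parameters.
function buildTitleParam(title) {
  return "action_name=" + encodeURIComponent(title);
}

// Hypothetical stand-in for the single decode done on the receiving side.
function decodeOnce(param) {
  return decodeURIComponent(param.split("=")[1]);
}

const rawTitle = "Café & Bar";

// Correct usage: pass the un-encoded title.
const correct = buildTitleParam(rawTitle);
// "action_name=Caf%C3%A9%20%26%20Bar"

// Incorrect usage: the caller encodes first, so the value is encoded twice.
const doubled = buildTitleParam(encodeURIComponent(rawTitle));
// "action_name=Caf%25C3%25A9%2520%2526%2520Bar"

console.log(decodeOnce(correct)); // "Café & Bar"
console.log(decodeOnce(doubled)); // "Caf%C3%A9%20%26%20Bar"
```

After one decode, the double-encoded value still contains percent escapes, which is how a garbled page title would show up in reports.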
sanitizeInputValue() is called for every input value, so it runs very often, and I think charset detection is pretty slow...
Is the piwik.js fix not enough to get the page titles right?
I'll rework it.
(In ) refs #2185 - revert r4080
(In ) refs #2185 - sanitizeInputValue() returned '' if input wasn't valid UTF-8
The attached patch moves html_entity_decode() back to sanitizeInputValue(), and tries to detect and fix double encoding.
I'll come back to this problem after I've thought more about what the implications are.
Do we need to handle this use case, though? It has never been a problem so far, and I really don't want to complicate the sanitize function, because it is heavily used and security-related. It must stay simple and fast. So I vote for updating the doc to clarify that we don't accept encoded values, and leaving sanitize as is on trunk (with your new test in the function).