Let's assume we have the search url on the marketplace containing a &
in search query (see #14715):https://plugins.matomo.org/search?query=Heatmap+%26+session+recording&post_type=product
Currently Matomo tracker js would run a safeDecodeWrapper
on it. So the resulting url, that would be used for tracking would behttps://plugins.matomo.org/search?query=Heatmap+&+session+recording&post_type=product
, which is actually not the same
fixes #14715
We might need to check if there's any other way to fix this as this was added because of https://github.com/matomo-org/matomo/issues/8413 in https://github.com/matomo-org/matomo/pull/8510
Anyway, let's for now focus on Matomo 4 release etc.
@tsteur This might actually fix some more issues that are related to encoding and page charset.
It shouldn't make any problems like #8413 or #8510, as before decodeURIComponant
was used, which throwed the error. Now the url would not be decoded. I'm actually still wondering why that should be needed at all. Are there any cases where the location
object might hold any values that are double encoded?
@sgiehl I think the original urldecode was added because of https://github.com/matomo-org/matomo/issues/7220 and maybe also https://github.com/matomo-org/matomo/issues/7218 . If we now removed that method, we'd have a problem that URL's would be likely stored URL encoded in the actions meaning it will create a new ID action for every URL and it can cause a lot of problems eg if some reports don't group by URL etc. Might need to test this but there's a chance we can't merge this.
I guess since #7220 a lot has happened in browser development. Seems the url and referrer are no longer url encoded in modern browsers
@sgiehl do you have maybe some reference to that where browsers changed that? I just tested with some URLs and it seems to be still URL encoded. I would assume it could break a lot of things if browsers simply changed that but it's possible they did.
eg viewing /test.html?x=34L#$3L<a href='/4'>#4</a>L1>124:56:7&test
then some characters were encoded
Ok. Now I understand what you are referring to. That's still normal that some special characters are encoded. But it's actually not the full url. So we shouldn't manipulate the complete url using any decodeWrapper.
Let's assume you request something like test.html?q=test%26q%3Dp
, which is a valid request with one query parameter.
After decoding it you get test.html?q=test&q=p
, which is obviously something else.