@bx80 opened this Pull Request on September 19th 2021 Contributor

Description:

Fixes #11627

(This is a re-submission of PR 17887 to work around merge failure)

Adds a new option to the JavaScript tracker to exclude query parameters from the tracked URL.

_paq.push(["setExcludedQueryParams", ["uid", "sid"]]);

Existing purify() and removeUrlParameter() functions were reused.

The tracker code generator has been updated to automatically add this option to the generated javascript tracking code using the excluded query parameters configured under Websites > Settings.

Tracker code generator unit tests have also been updated to test for valid tracker code and string escaping with the option enabled.

(Initially used "setIgnoreUrlParams" as the option name as per issue report, but though it would be more consistent to use "setExcludedQueryParams" to match the existing config option and UI terminology.)

Review

@bx80 commented on September 19th 2021 Contributor

build js

@sgiehl commented on September 20th 2021 Member

If I see that correctly the Tracking Code Generator currently automatically adds a setExcludedQueryParams to the tracking code if any excluded params are defined globally or for the website. I'm wondering if it might make sense to make it possible to adjust that value (only within the code generator). @tsteur do you think we should add that?

@justinvelluppillai commented on September 20th 2021 Contributor

@sgiehl I don't see where it gets the excluded params for the website - looks like it only gets the global ones currently?

@tsteur commented on September 20th 2021 Member

@sgiehl not sure. I'm kind of thinking we wouldn't need to expose this to the tracker code generator UI at all as it's rarely needed maybe. I'm not sure if we should expose the global list of excluded URL parameters. Not sure if there could be any risk with this or not. I suppose it might not be. Generally, there is also the list of excluded parameters from the config url_query_parameter_to_exclude_from_url=gclid,fbclid,fb_xd_fragment,fb_comment_id,phpsessid,jsessionid,sessionid,aspsessionid,doing_wp_cron,sid,pk_vid which maybe would need to be added automatically too. However, excluding some parameters like gclid,fbclid can break functionality.

@mattab what are your thoughts? The above logic implements that global excluded config parameter gets exposed in the tracking code generator. Say you have configured this:
image

then it adds below code _paq.push(["setExcludedQueryParams", ["foo","bar"]]);.

image

Having the excludedParameters API parameter in the tracking code generator is 👍 but I would maybe not add any global excluded parameter by default. I would probably also not add a UI to enter extra parameters to exclude as it's simply needed so rarely. On the other side someone could argue when they copy/paste the tracking code they can still remove the line for removing URL parameters if they didn't want to expose this information but often users wouldn't even know what it is.

Note that the global excluded parameters list supports REGEXP which I think the JS logic isn't.

To have a bit more security by default I would exclude these parameters by default phpsessid,jsessionid,sessionid,aspsessionid. There's also sid|zenid etc but these may be used for other purposes in an application meaning we cannot exclude them by default.

@bx80 commented on September 20th 2021 Contributor

If we're not wanting to expose the excluded parameters in the generated tracking code then maybe we could hash the values when generating the tracking code and then have purify() check those hashes against the hashed value of the parameters? That way it wouldn't be possible for users to just read off a list of interesting parameters from the tracking code.

@tsteur commented on September 20th 2021 Member

I reckon with rainbow tables etc the original value could be often found out quite quickly. See internal slack we would for now go only with exposing the website specific excluded parameters.
That means we probably can't exclude any other parameters like phpsessid,jsessionid,sessionid,aspsessionid by default which is fine.

If someone needs to exclude parameters then they can still exclude them manually by adjusting the tracking code.

@sgiehl commented on September 21st 2021 Member

Is there actually any need to include the server side excluded parameters into the tracking code? Imho we could simply not include them there at all. The server side exclusion will still work the same...
But having the option in the tracker nevertheless might make sense, as it allows to remove parameters from being sent along with the tracking request.

@tsteur commented on September 21st 2021 Member

@sgiehl we don't need it there. In the issue back then I mentioned we could do it. I reckon now that it's already it can be a good benefit that we show the site specific excluded parameters automatically. It's good for privacy and security to remove these already in the site.

@bx80 commented on September 22nd 2021 Contributor

build js

@sgiehl commented on September 27th 2021 Member

build js

@sgiehl commented on September 27th 2021 Member

It could make sense to also reflect those changes in the UI tests. We already have a test that shows the tracking code generator here: https://github.com/matomo-org/matomo/blob/03dbff2aaff77b93cef5158e8509f3c370947748/tests/UI/specs/UIIntegration_spec.js#L793-L803
So maybe we could simply add some globally excluded site parameters, so they are displayed within the tests. Guess we could add that code somewhere around here: https://github.com/matomo-org/matomo/blob/8c3f495c64a5209f0110f49588c4840ff2cd07fe/tests/PHPUnit/Fixtures/UITestFixture.php#L107-L110
To ensure some "bad parameters" won't hurt, it might be good to use some xss parameters like:

        \Piwik\Plugins\SitesManager\API::getInstance()->setGlobalExcludedQueryParameters(implode(',', [
            $this->xssTesting->forTwig('excludedParameter'),
            $this->xssTesting->forAngular('excludedParameter'),
        ]));
@sgiehl commented on September 30th 2021 Member

Tried updating the remaining UI tests, but the artifacts upload is currently not working, so I'm not able to check if the failing tests are related. Once the related UI tests are fixed guess this should be ready to merge

This Pull Request was closed on October 4th 2021
Powered by GitHub Issue Mirror