New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Exclude some url parameters from referrer urls #15905
Conversation
b95515b
to
7344e55
Compare
fyi @sgiehl adding a few more
plugins/Referrers/Columns/Base.php
Outdated
} | ||
if (!empty($this->referrerHost) && strpos($this->referrerHost, 'bing.com') !== false) { | ||
$parametersToExclude[] = 'cvid'; | ||
$parametersToExclude[] = 'refig'; | ||
$parametersToExclude[] = 'elv'; | ||
$parametersToExclude[] = 'plvar'; | ||
$parametersToExclude[] = 'setlang'; | ||
$parametersToExclude[] = 'qs'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure you want to exclude setlang
and cc
? those are set to iso country and language codes based on the users selection. As bing is not using country domains that might be the only way to get that info (if it's relevant for someone)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we should be able to see this based on geolocation and detected language I suppose. Removed it cause was seeing heaps of referrers from different bing countries / languages. Then each referrer url had like only 1 visit and weren't visible vs if we remove these parameters, they would have had maybe a count of 4 or 5 and then these referrers would have become visible in the reports because they represent same search term. It's really about trying to group as many referrers as meaningful together as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. But then we could also group together all google tlds 🤷
Actual I'm wondering if that is the correct way. We are manipulating the tracked referrer and might cut of information that might possibly be useful for anyone.
Where exactly does it cause memory problems having too many different referrer urls? Wondering if it wouldn't be enough to do the referrer url manipulation only while archiving the reports and keep storing the original referrer in the database for live reports.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It causes issues in MySQL... like queries taking 6 hours to run because they do group by "referer_type", "referer_name", "referer_keyword", "referer_url"
resulting in millions of rows vs now we're like doing with 30K rows.
We are manipulating the tracked referrer and might cut of information that might possibly be useful for anyone.
Sure some might be interested, and they could maybe write a plugin. But it would be a very small percentage that is impacted by this likely vs say 95% would profit from it. Eventually we will be even cutting out a lot more by default eg for privacy reason as otherwise a lot of personal data is tracked requiring tracking consent see #15426
It's also making the reports more useful as more information is grouped together and things become more accurate. Super important referrers suddenly show up in reports (we can see the effect already) that were invisible before.
AFAIK there wouldn't be much of a way for users to utilise the full referrer url currently anyway? Except for maybe segmenting?
So far it really helps with memory, performance, correctness of data, and even improves privacy.
Of course if needed can keep some parameters like lang or country if really needed, just makes the result maybe less accurate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. makes sense. Should we implement a possibility to disable that manipulation?
If not, feel free to review and merge if everything is fine now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can always add a setting later if / when needed. Not wanting to add too many settings that may not be needed.
Cheers @sgiehl |
fixes #15902