wrong results for getHostName(url) in piwik.js #19634

Open
volker-attempto opened this issue Aug 16, 2022 · 1 comment
Labels
Bug For errors / faults / flaws / inconsistencies etc.

Comments

@volker-attempto

I noticed that the tracked hostname is wrong when the URL's query parameters include the @ (at) symbol.

Expected Behavior

hostname is extracted correctly from href / url

Current Behavior

the string after the first @ sign is treated as the hostname, even when that @ appears in the query string

Possible Solution

fix the RegExp in piwik.js

Steps to Reproduce (for Bugs)

  1. Set up the tracking script on a page with a URL like 'http://www.example.org:3000/passwort-zuruecksetzen?email=email@example.com&code=844815'
  2. In the Matomo backend, 'example.com&code=844815' is shown as the hostname

To reproduce, I copied the getHostName function from piwik.js into a separate file:

function getHostName(url) {
    // scheme : // [username [: password] @] hostname [: port] [/ [path] [? query] [# fragment]]
    var e = new RegExp('^(?:(?:https?|ftp):)/*(?:[^@]+@)?([^:/#]+)'),
        matches = e.exec(url);

    return matches ? matches[1] : url;
}


console.log(1, getHostName('https://www.example.org'));
console.log(2, getHostName('https://www.example.org?code=1234'));
console.log(3, getHostName('https://user:passwd@www.example.org?code=1234'));
console.log(4, getHostName('https://user:passwd@www.example.org/bla?code=1234'));
console.log(5, getHostName('http://www.example.org:3000/passwort-zuruecksetzen?email=email@example.com&code=844815'));
console.log(6, getHostName('http://www.example.org:3000/passwort-zuruecksetzen?email=email%40example.com&code=844815'));
console.log(7, getHostName('http://user:pass@www.example.org:3000/passwort-zuruecksetzen?email=email@example.com&code=844815'));

The output should, IMO, always be 'www.example.org'. Running node hostname.js prints:

1 www.example.org
2 www.example.org?code=1234
3 www.example.org?code=1234
4 www.example.org
5 example.com&code=844815
6 www.example.org
7 www.example.org
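
Cases 2, 3 and 5 fail for two reasons: the optional userinfo group ([^@]+@) can run past the authority into the path and query, and the hostname group ([^:/#]+) does not stop at '?'. One possible adjustment, as a minimal sketch only, is to exclude '/', '?' and '#' from the userinfo group and '?' from the hostname group. The name getHostNameFixed is hypothetical, and this is not the fix adopted in piwik.js or optOut.js:

```javascript
// Hypothetical variant of getHostName; a sketch, not the change shipped in piwik.js.
function getHostNameFixed(url) {
    // scheme : // [username [: password] @] hostname [: port] [/ [path] [? query] [# fragment]]
    // The userinfo part must not contain '/', '?' or '#', so an '@' in the
    // path, query or fragment is never taken as the userinfo delimiter.
    // The hostname part additionally stops at '?'.
    var e = new RegExp('^(?:(?:https?|ftp):)/*(?:[^@/?#]+@)?([^:/?#]+)'),
        matches = e.exec(url);

    return matches ? matches[1] : url;
}

// The seven URLs from the reproduction above.
var urls = [
    'https://www.example.org',
    'https://www.example.org?code=1234',
    'https://user:passwd@www.example.org?code=1234',
    'https://user:passwd@www.example.org/bla?code=1234',
    'http://www.example.org:3000/passwort-zuruecksetzen?email=email@example.com&code=844815',
    'http://www.example.org:3000/passwort-zuruecksetzen?email=email%40example.com&code=844815',
    'http://user:pass@www.example.org:3000/passwort-zuruecksetzen?email=email@example.com&code=844815'
];

urls.forEach(function (url, i) {
    console.log(i + 1, getHostNameFixed(url));
});
```

With this pattern all seven calls return 'www.example.org'. Like the original, the sketch still splits on ':' and so does not handle bracketed IPv6 hosts.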

Context

Your Environment

  • Matomo Version: 4.10.1
  • PHP Version: not relevant
  • Server Operating System: not relevant
  • Additionally installed plugins: not relevant
  • Browser: Brave Version 1.42.86, Chromium 104.0.5112.81 (Official Build) (arm64)
  • Operating System: MacOS 12.5
@volker-attempto volker-attempto added the Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced. label Aug 16, 2022
@justinvelluppillai
Contributor

Great bug report, thanks! We can improve this regex which is also currently used in optOut.js. I will put it in the queue for prioritisation by the product team.

@justinvelluppillai justinvelluppillai added Bug For errors / faults / flaws / inconsistencies etc. and removed Potential Bug Something that might be a bug, but needs validation and confirmation it can be reproduced. labels Aug 17, 2022
@justinvelluppillai justinvelluppillai added this to the For Prioritization milestone Aug 17, 2022