@volker-attempto opened this Issue on August 16th 2022

I noticed that the tracked hostname is wrong when the URL contains query parameters that include the @ (at) symbol.

Expected Behavior

hostname is extracted correctly from href / url

Current Behavior

the string after the first @ sign is treated as the hostname, even when that @ sign is part of the query string

Possible Solution

fix the RegExp in piwik.js

Steps to Reproduce (for Bugs)

  1. Set up the tracking script on a page with a URL like 'http://www.example.org:3000/passwort-zuruecksetzen?email=email@example.com&code=844815'
  2. In the Matomo backend, 'example.com&code=844815' is shown as the hostname

To reproduce, I copied the getHostName function from piwik.js into a separate file:

function getHostName(url) {
    // scheme : // [username [: password] @] hostname [: port] [/ [path] [? query] [# fragment]]
    var e = new RegExp('^(?:(?:https?|ftp):)/*(?:[^@]+@)?([^:/#]+)'),
        matches = e.exec(url);

    return matches ? matches[1] : url;
}

console.log(1, getHostName('https://www.example.org'));
console.log(2, getHostName('https://www.example.org?code=1234'));
console.log(3, getHostName('https://user:passwd@www.example.org?code=1234'));
console.log(4, getHostName('https://user:passwd@www.example.org/bla?code=1234'));
console.log(5, getHostName('http://www.example.org:3000/passwort-zuruecksetzen?email=email@example.com&code=844815'));
console.log(6, getHostName('http://www.example.org:3000/passwort-zuruecksetzen?email=email%40example.com&code=844815'));
console.log(7, getHostName('http://user:pass@www.example.org:3000/passwort-zuruecksetzen?email=email@example.com&code=844815'));

IMO the output should always be 'www.example.org'. The actual output of running node hostname.js is:

1 www.example.org
2 www.example.org?code=1234
3 www.example.org?code=1234
4 www.example.org
5 example.com&code=844815
6 www.example.org
7 www.example.org
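
One possible direction for the fix, as a minimal sketch only (this is not the actual Matomo patch, and getHostNameFixed is a hypothetical name): do not let the optional userinfo part run across '/', '?' or '#', and let the hostname also stop at '?'. The assumption is that an @ sign only counts as the userinfo delimiter when it appears before the first '/', '?' or '#' after the scheme.

```javascript
// Hypothetical variant of getHostName; not the actual Matomo fix.
function getHostNameFixed(url) {
    // scheme : // [username [: password] @] hostname [: port] [/ [path] [? query] [# fragment]]
    // The userinfo part may not contain '/', '?' or '#', so an '@' in the
    // path, query or fragment is never mistaken for the userinfo delimiter.
    // The hostname also stops at '?', which covers URLs without a path.
    var e = new RegExp('^(?:(?:https?|ftp):)/*(?:[^@/?#]+@)?([^:/?#]+)'),
        matches = e.exec(url);

    return matches ? matches[1] : url;
}

console.log(5, getHostNameFixed('http://www.example.org:3000/passwort-zuruecksetzen?email=email@example.com&code=844815'));
```

With this pattern, all seven URLs from the reproduction above should yield 'www.example.org', including cases 2, 3 and 5.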

Context

Your Environment

  • Matomo Version: 4.10.1
  • PHP Version: not relevant
  • Server Operating System: not relevant
  • Additionally installed plugins: not relevant
  • Browser: Brave Version 1.42.86 Chromium: 104.0.5112.81 (Official Build) (arm64)
  • Operating System: MacOS 12.5
@justinvelluppillai commented on August 17th 2022 Member

Great bug report, thanks! We can improve this regex, which is also currently used in optOut.js. I will put it in the queue for prioritisation by the product team.
