@robocoder opened this Issue on July 4th 2010 Contributor

Do we properly handle:

  • malformed URLs, e.g. http:domain and http:/domain (which are auto-corrected by some browsers)
  • trailing period on the TLD, e.g. http://domain.tld./path
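
To make the two cases concrete, here is a minimal sketch of what "handling" them could look like. The helper name `normalizeTrackedUrl()` is hypothetical and is not part of Piwik or its Tracker API clients. It only assumes http/https URLs and repairs exactly the two patterns listed above.

```php
<?php
// Hypothetical helper for illustration only; not part of Piwik.
function normalizeTrackedUrl(string $url): string
{
    // Case 1: "http:domain" or "http:/domain" becomes "http://domain".
    // Matches the scheme, then zero or one slash, only when the next
    // character is not another slash (so "http://" is left alone).
    $url = preg_replace('~^(https?):/?(?=[^/])~i', '$1://', $url) ?? $url;

    // Case 2: drop one trailing period from the host,
    // e.g. "domain.tld./path" becomes "domain.tld/path".
    // The period must be followed by a port, path, query, fragment, or the end.
    $url = preg_replace('~^(https?://[^/?#:]+?)\.(?=[:/?#]|$)~i', '$1', $url) ?? $url;

    return $url;
}
```

Well-formed URLs pass through unchanged, so a helper like this could in principle sit in front of setUrl() without affecting existing clients. Whether such rewriting is desirable at all is a separate question.
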
@mattab commented on July 12th 2010 Member

To test this, I would recommend modifying one of the tracker setups in Main.test.php to call setUrl() with the malformed URLs.

@mattab commented on July 22nd 2010 Member

If we change this, we will have several formats of URLs coming in from several versions of the Tracker API clients (http://piwik.org/docs/tracking-api/)

The goal of this ticket would be to avoid triggering the mod_security error when a URL is found in a GET parameter. DreamHost (or is it Bluehost?) can disable mod_security on demand.

@mattab commented on January 17th 2011 Member

I tried with the patch:

### Eclipse Workspace Patch 1.0
#P trunk
Index: tests/integration/Main.test.php
--- tests/integration/Main.test.php (revision 3762)
+++ tests/integration/Main.test.php (working copy)
@@ -258,7 +258,7 @@
         $visitorB->setUrlReferer( '' );
        $visitorB->setUserAgent('Opera/9.63 (Windows NT 5.1; U; en) Presto/2.1.1');
-       $visitorB->setUrl('http://example.org/products');
+       $visitorB->setUrl('http:/example.org/products');
        $this->checkResponse($visitorB->doTrackPageView('first page view'));

        // -
@@ -277,7 +277,7 @@
         $visitorAsite2 = $this->getTracker($idSite2, Piwik_Date::factory($dateTime)->addHour(24)->getDatetime(), $defaultInit = true);
         $visitorAsite2->setUserAgent('Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0;)');
-        $visitorAsite2->setUrl('http://example2.com/home');
+        $visitorAsite2->setUrl('http://example2.com./home');
         $this->checkResponse($visitorAsite2->doTrackPageView('Website 2 page view'));

         // Returning visitor on Idsite 2 1 day later, one page view, with chinese referer

and I see that

  • malformed URLs: they are tracked, but they are split at the / separator after http: and again at the first / after the hostname, instead of showing the first category by default.
  • extra dot in the domain name: handled correctly, i.e. the request is tracked as the URL of the requested page whether or not the dot is present.

I don't think we need to handle malformed URLs, since they could reveal a bug in the website (i.e. a wrong link somewhere) or in a partner/campaign href.

This Issue was closed on January 17th 2011