Dynamic Referrer Spam Blacklist #2891

Closed
anonymous-matomo-user opened this issue Jan 28, 2012 · 2 comments
Labels
Enhancement: For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc.
wontfix: If you can reproduce this issue, please reopen the issue or create a new one describing it.

Comments

@anonymous-matomo-user

Referrer spam is a menace today, but simply blacklisting URLs or IP addresses is bound to fail. How about a dynamic blacklist?

Here is how I envision it working. Essentially, there is a "score table" that keeps a score for the domain of each referring URL. The better behaved the domain, the higher its score; the worse behaved, the lower.

  1. The score table is initialized with several friendly search engines, custom URLs (specified by the admin), and the host URL, each with some high score (or a score the admin can customize). URLs in this whitelist are considered safe.
  2. Anti-flood: if a referrer URL appears X times within Y seconds, penalize the domain's score by Z. X, Y, and Z should be customizable.
  3. For each referrer URL whose domain is not in the score table OR has a score < X, retrieve the HTML page at the referring URL using curl (after some delay) and parse it to see whether the page really contains a visible link to our webpage. If it does, the referring URL is legitimate, so raise the domain's score by Y. Otherwise the URL is spam, so penalize the domain by Z. X, Y, and Z should be customizable.
  4. Look up the referrer URL or IP address in Project Honey Pot or other anti-spam services to see whether it is already blacklisted. If so, penalize the domain by X.
  5. For each incoming referrer URL, if the domain score is low enough (< X), ban it right away. If the domain score is high enough (> Y), record it right away.
  6. Over time (after Z weeks), purge the records or multiply the scores by a number between 0 and 1, so that even high-scoring domains are rechecked from time to time.
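
To make the score table concrete, here is a rough Python sketch of steps 1, 5, and 6. It is only an illustration, not Matomo code: the `ScoreTable` class, its method names, and the default thresholds are all made up for this example, and unknown domains are assumed to start at a score of 0.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable
from urllib.parse import urlsplit


def domain_of(referrer_url: str) -> str:
    """Return the host part of a referrer URL (empty string if there is none)."""
    return (urlsplit(referrer_url).hostname or "").lower()


@dataclass
class ScoreTable:
    # All three defaults are placeholder values chosen for illustration.
    ban_below: float = -10.0    # step 5: ban when score < this
    allow_above: float = 10.0   # step 5: record when score > this
    decay_factor: float = 0.5   # step 6: multiplier between 0 and 1
    scores: Dict[str, float] = field(default_factory=dict)

    def seed(self, trusted_domains: Iterable[str], score: float = 100.0) -> None:
        """Step 1: start whitelisted domains with a high score."""
        for domain in trusted_domains:
            self.scores[domain.lower()] = score

    def adjust(self, referrer_url: str, delta: float) -> float:
        """Reward (positive delta) or penalize (negative delta) a domain."""
        domain = domain_of(referrer_url)
        self.scores[domain] = self.scores.get(domain, 0.0) + delta
        return self.scores[domain]

    def decide(self, referrer_url: str) -> str:
        """Step 5: return 'ban', 'allow', or 'check' (needs further checks)."""
        score = self.scores.get(domain_of(referrer_url), 0.0)
        if score < self.ban_below:
            return "ban"
        if score > self.allow_above:
            return "allow"
        return "check"

    def decay(self) -> None:
        """Step 6: pull every score toward 0 so domains get rechecked."""
        for domain in self.scores:
            self.scores[domain] *= self.decay_factor
```

Running `decay()` on a schedule (the "after Z weeks" part) would be left to a cron job or similar; that scheduling is not shown here.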

I think steps 2, 3, and 4 will catch most of the referrer spam.
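
For steps 2 and 3, something along these lines might work. Again this is a hypothetical Python sketch: `FloodDetector` and `links_to` are names invented here, the page fetch (the curl part) is left out so the function takes already-downloaded HTML, and it only checks that an `<a href>` pointing at our host exists, not that the link is actually visible.

```python
import time
from collections import defaultdict, deque
from html.parser import HTMLParser
from urllib.parse import urlsplit


class FloodDetector:
    """Step 2: flag a domain seen max_hits times within window_seconds."""

    def __init__(self, max_hits: int, window_seconds: float):
        self.max_hits = max_hits              # X in the proposal
        self.window_seconds = window_seconds  # Y in the proposal
        self._hits = defaultdict(deque)       # domain -> recent hit times

    def hit(self, domain: str, now: float = None) -> bool:
        """Record one hit; return True if the domain is currently flooding."""
        now = time.monotonic() if now is None else now
        hits = self._hits[domain]
        hits.append(now)
        # Drop hits that have fallen out of the sliding window.
        while hits and now - hits[0] > self.window_seconds:
            hits.popleft()
        return len(hits) >= self.max_hits


class _LinkCollector(HTMLParser):
    """Collect the hostnames of all <a href="..."> targets in a page."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        try:
            host = urlsplit(href).hostname
        except ValueError:  # malformed URL in the page
            return
        if host:
            self.hosts.add(host)


def links_to(html: str, our_host: str) -> bool:
    """Step 3: does the referring page contain a link to our host?"""
    collector = _LinkCollector()
    collector.feed(html)
    return our_host.lower() in collector.hosts
```

The caller would then penalize the domain by Z when `hit()` returns True, and add Y or subtract Z depending on what `links_to()` returns.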

@mattab
Member

mattab commented Jan 29, 2012

Like you say, this is not a problem today. Anything dynamic will add overhead, so if we do it, the first version would certainly be static (for performance). Thanks for the suggestion; maybe we can reassess later.

@robocoder
Contributor

See #2268

@anonymous-matomo-user anonymous-matomo-user added this to the Future releases milestone Jul 8, 2014
This issue was closed.

3 participants