Feature to delete historical referral spam #8404

Closed
hi2u opened this issue Jul 22, 2015 · 2 comments
Labels
Enhancement: For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc.
not-in-changelog: For issues or pull requests that should not be included in our release changelog on matomo.org.
trackingspam: For issues related to receiving tracking requests from spammers and bots.

Comments

@hi2u

hi2u commented Jul 22, 2015

It's great that Piwik now keeps an up-to-date list of referral spammers, but this doesn't seem to affect any past stats.

It would be great if there were a feature to delete all historical data from referral spam bots. This could either be triggered automatically, or maybe with a button in the web interface or command on the server.
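
As a rough illustration of what such a cleanup would have to decide for each stored visit, here is a minimal sketch of checking a referrer URL against a spammer list. The function name `is_spam_referrer` and the sample domains are hypothetical, and it is an assumption that the list can be treated as a plain set of hostnames; this is not a description of how Piwik matches spammers internally.

```python
from urllib.parse import urlparse


def is_spam_referrer(referrer_url, spam_domains):
    """Return True if the referrer's host is a listed domain or a subdomain of one."""
    host = (urlparse(referrer_url).hostname or "").lower()
    if not host:
        return False
    # Check the host itself and every parent domain,
    # e.g. www.spam-example.test -> spam-example.test -> test
    parts = host.split(".")
    return any(".".join(parts[i:]) in spam_domains for i in range(len(parts)))


# Hypothetical entries; a real list would come from the regularly updated spammer list.
spam_domains = {"spam-example.test", "free-traffic.example"}

print(is_spam_referrer("http://www.spam-example.test/?ref=1", spam_domains))
print(is_spam_referrer("https://legit.example.org/page", spam_domains))
```

Matching on parent domains as well as the exact host is a design choice in this sketch, so that `www.` and other subdomains of a listed spammer are caught too.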

I only recently upgraded my Piwik installation, so most of my stats are still clogged up with referral spam.

There's also the fact that even with the new auto-updating spammer list, there will always be a gap between the time a spammer is discovered, the time it is added to the list, and the time the updated list reaches a given Piwik installation. A feature that deletes historical spam would also close that gap going forward, giving everyone an "eventually pure", spam-free database.

Another bonus is a smaller database. 99.99% of us don't want this data at all, and for small websites it can easily account for 50% to 99% of the stats.

Maybe a very small number of people want to keep referral spam, so for them it could be made optional, although I think auto-deleting junk data would be a sensible default.

If somebody is able to implement this as a manually triggered action, it would be great if it could be applied to all websites in the Piwik installation in one go.

@mnapoli added the Enhancement label Jul 23, 2015
@nekohayo

nekohayo commented Sep 7, 2015

Aye, some people have been requesting this, for example in http://forum.piwik.org/read.php?2,127138.

In #3385 (comment), @mattab mentioned an SQL command to delete some crap from (I think) the two main affected tables (besides archives), "log_visit" and "log_link_visit_action". I've been scratching my head tonight trying to understand that sample SQL command, to see if I could adapt it to this case and search for those blacklisted referrers in the "referrer_url" or "referrer_name" columns, but I didn't figure it out. I think the solution (in terms of the command) is probably 99% there, but I'm not familiar with SQL (yet) and the logic in that query eludes me.
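
To make the idea concrete, here is a minimal sketch of what such a deletion could look like. It is not the command from #3385, and it runs against a tiny in-memory SQLite stand-in rather than a real Piwik database (which uses MySQL and usually a table prefix such as `piwik_`). The column names `referer_name`, `referer_url`, `idvisit` and `idlink_va`, the assumption that `referer_name` holds the referrer's hostname, the helper `delete_spam_visits` and the sample domains are all assumptions made for illustration. Archives are not touched, and anything like this should only be tried on a backup first.

```python
import sqlite3

# Hypothetical blacklist entries; real ones would come from the spammer list.
SPAM_DOMAINS = ["spam-example.test", "free-traffic.example"]


def delete_spam_visits(conn, spam_domains):
    """Delete visits whose referrer name is blacklisted, plus their actions.

    Returns the number of visits removed.
    """
    if not spam_domains:
        return 0
    placeholders = ",".join("?" for _ in spam_domains)
    spam_visits = f"SELECT idvisit FROM log_visit WHERE referer_name IN ({placeholders})"
    with conn:  # one transaction: commit on success, roll back on error
        # Remove dependent action rows first, then the visits themselves.
        conn.execute(
            f"DELETE FROM log_link_visit_action WHERE idvisit IN ({spam_visits})",
            spam_domains,
        )
        cur = conn.execute(
            f"DELETE FROM log_visit WHERE referer_name IN ({placeholders})",
            spam_domains,
        )
    return cur.rowcount


# Tiny in-memory stand-in for the two tables, with only the columns used here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE log_visit (idvisit INTEGER PRIMARY KEY, referer_name TEXT, referer_url TEXT);
    CREATE TABLE log_link_visit_action (idlink_va INTEGER PRIMARY KEY, idvisit INTEGER);
    INSERT INTO log_visit VALUES
        (1, 'spam-example.test', 'http://spam-example.test/'),
        (2, 'legit.example.org', 'https://legit.example.org/page'),
        (3, 'free-traffic.example', 'http://free-traffic.example/x');
    INSERT INTO log_link_visit_action VALUES (10, 1), (11, 2), (12, 3), (13, 3);
""")

removed = delete_spam_visits(conn, SPAM_DOMAINS)
remaining_visits = conn.execute("SELECT COUNT(*) FROM log_visit").fetchone()[0]
remaining_actions = conn.execute("SELECT COUNT(*) FROM log_link_visit_action").fetchone()[0]
print(removed, remaining_visits, remaining_actions)  # prints: 2 1 1
```

The action rows are deleted before the visits because the subquery that finds them depends on the visit rows still being present.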

But a feature in Piwik that would take care of not only the logs but also the archives (because reprocessing years of archives for multiple sites takes a long time!) would be absolutely wonderful.

@mattab added this to the Mid term milestone Sep 18, 2015
@mattab modified the milestones: Long term, Mid term Dec 5, 2016
@sgiehl added the trackingspam label Nov 26, 2020
@tsteur
Member

tsteur commented Nov 30, 2020

This is possible with our GDPR tool, where you can delete visits; the reports will then be re-archived. Closing this issue, as we've used the tool before for exactly this reason and it works well.

@tsteur closed this as completed Nov 30, 2020
@tsteur added the not-in-changelog label Nov 30, 2020

6 participants