@mattab opened this Issue on December 23rd 2018 Member

Currently we're using Google Recaptcha on pages with a form, which leaks lots of data to Google.

For example on this page: https://matomo.org/contact/

-> It would be fantastic to find & use an open source, decentralised alternative to Google recaptcha on our Matomo.org website.

If anyone knows an alternative to Recaptcha that works, please let us know.

@fdellwing commented on December 24th 2018 Contributor

There are a lot of captcha libraries, but none of them provide the features that reCaptcha does.

@Findus23 commented on December 24th 2018 Member

@fdellwing The only feature we need is not getting overwhelmed with spam :slightly_smiling_face:

Bonus points if it is accessibility-friendly.

@fdellwing commented on December 24th 2018 Contributor

As I said, I know of no captcha that is nearly as user-friendly as reCaptcha. So the best option would be to take some random image captcha (there are MANY) and just put a self-made database on top that recognises returning users.
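
A rough sketch of what such a "database on top" could look like, assuming a random token is handed out as a cookie once a captcha has been solved. The class name `ReturningUsers`, the 30-day lifetime and the in-memory dict are all made up for illustration; a real site would persist this somewhere.

```python
import hashlib
import secrets
import time
from typing import Dict, Optional

THIRTY_DAYS = 30 * 24 * 3600  # assumed lifetime of a "trusted" token


class ReturningUsers:
    """In-memory stand-in for a self-made database of users who already passed."""

    def __init__(self, ttl_seconds: float = THIRTY_DAYS) -> None:
        self._ttl = ttl_seconds
        self._expiry_by_hash: Dict[str, float] = {}

    @staticmethod
    def _digest(token: str) -> str:
        # Store only a hash so a leaked table does not reveal usable tokens.
        return hashlib.sha256(token.encode()).hexdigest()

    def remember(self, now: Optional[float] = None) -> str:
        """Call after a solved captcha; returns a random token to set as a cookie."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._expiry_by_hash[self._digest(token)] = now + self._ttl
        return token

    def is_known(self, token: Optional[str], now: Optional[float] = None) -> bool:
        """True if the cookie token was issued earlier and has not expired."""
        if not token:
            return False
        now = time.time() if now is None else now
        expiry = self._expiry_by_hash.get(self._digest(token))
        return expiry is not None and now < expiry
```

A form handler would call `is_known()` with the cookie value first and only show the image captcha when it returns `False`.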

@Findus23 commented on December 27th 2018 Member

> As I said, I know no captcha that is nearly as user friendly as reCaptcha

I really have to disagree. I regularly spend multiple minutes getting angrier and angrier as I am clicking through page after page arguing whether something can be considered a storefront when the captcha switches into extra-slow mode where every image takes a 5-second transition to load.
(I am not using a VPN or anything similar, just a regular internet connection)

I think a captcha doesn't need to be complex to stop most bots (after all, while Recaptcha is hard to circumvent, it only costs 0.2 cent to pay someone to solve it for you), it just needs to be different enough that it stops automated bots programmed for popular WordPress forms.

I even think that a simple input field asking to enter the name of the open source project you are trying to contact (that maybe also allows common variants) would stop nearly all automated spam.
And the remaining ones I think (from what I see on the forum) are actual people pasting spam texts into the forms and those are not blockable via captchas.
@tsteur, would it be possible to add something like this to the forms without too much work?
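
To make the idea concrete, here is a minimal sketch of the server-side check for such a question field. The accepted answers and the function names are assumptions for illustration only, not anything that exists on matomo.org.

```python
import unicodedata
from typing import Optional

# Hypothetical list of accepted answers; the suggestion above only says
# "the name of the project" plus "common variants".
ACCEPTED_ANSWERS = {"matomo", "piwik", "matomo analytics"}


def normalize(answer: str) -> str:
    """Lower-case, strip accents and collapse whitespace so near-misses still pass."""
    decomposed = unicodedata.normalize("NFKD", answer)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(without_accents.casefold().split())


def is_human_answer(answer: Optional[str]) -> bool:
    """Return True only if the field was submitted and matches an accepted variant."""
    if answer is None:
        return False
    return normalize(answer) in ACCEPTED_ANSWERS
```

With this, `is_human_answer("  Matomo ")` passes, while an empty or missing field is rejected.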

@tsteur commented on December 27th 2018 Member

As long as there is a wordpress plugin for it that should be fine. We wouldn't want to build anything ourselves. The plugin would ideally hook into random places where needed and support gravity forms etc.

@Findus23 commented on December 27th 2018 Member

https://wordpress.org/plugins/humancaptcha/ seems to be pretty much what I described, but the plugin looks odd and only seems to integrate with comments.
Apart from that I could only find https://wordpress.org/plugins/humancaptcha/ which seamlessly integrates into login, registration, lost password, comments, bbPress and Contact Form 7.

I have never used Gravity Forms before, but it seems to have many features and maybe one can make a required input field with the quiz feature. Not sure if it can be combined with the normal contact form.

@tsteur commented on December 27th 2018 Member

Did a quick search for "captcha gravity"; maybe https://wordpress.org/plugins/nomorecaptchas/ or https://wordpress.org/plugins/cleantalk-spam-protect/ would help? Cleantalk also seems to support WooCommerce. Not really sure how good they are though.

I reckon something where people need to enter "Matomo" might be too complicated sometimes for some humans (it seems easy but may not always be clear what to enter) and at the same time someone wanting to spam us could easily achieve it.

@Findus23 commented on December 27th 2018 Member

> https://wordpress.org/plugins/nomorecaptchas/ or https://wordpress.org/plugins/cleantalk-spam-protect/

Both plugins work by sending the visitor behaviour data to the services' servers and analyzing it there. So I guess they are no better than ReCAPTCHA.

It's odd that there isn't a well-maintained open source plugin that just does basic local analysis.
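
As an illustration of what purely local checks could look like, here is a small sketch using two common heuristics: a hidden honeypot field and a minimum fill time. Both heuristics, the field names and the 3-second threshold are my assumptions and not taken from any of the plugins mentioned here.

```python
from dataclasses import dataclass

MIN_SECONDS_TO_FILL = 3.0  # assumed threshold; humans rarely submit faster


@dataclass
class Submission:
    honeypot: str        # hidden field that humans never see (hypothetical name)
    rendered_at: float   # server timestamp when the form was served
    submitted_at: float  # server timestamp when the form came back


def looks_automated(sub: Submission) -> bool:
    """Flag a submission using only data that never leaves the server."""
    if sub.honeypot.strip():
        # Bots that fill in every input also fill in the hidden one.
        return True
    if sub.submitted_at - sub.rendered_at < MIN_SECONDS_TO_FILL:
        # The form came back faster than a person could plausibly type.
        return True
    return False
```

Nothing here is sent to a third party, which is the whole point, but it will obviously not stop anyone who targets the form specifically.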

> someone wanting to spam us could easily achieve it.

Targeted attackers will probably always be able to afford the 0.2 cent it costs to reliably circumvent all types of captcha.

@mt-dave commented on May 7th 2019

I would think an alternative to reCAPTCHA would be some kind of service, something that solves the traditional reCAPTCHA issues such as GDPR and accessibility while still offering a no-captcha option.

I came across some solutions and here is a quick summary

Captcha providers can broadly be divided into two categories:

Captcha service providers: This option works well for mission-critical enterprises looking for protection against constantly evolving spam and bot threats. Some of the industry players in captcha services are:

  • reCAPTCHA: Free and one of the most widely used captcha services across the globe. They have recently launched reCAPTCHA v3, which generates a risk score based on user behaviour on the site, Google cookies, traffic history, etc. GDPR has been a major concern considering what information it stores and uses for other Google products like Google Ads.
  • MTCaptcha: Captcha service that is more focused on enterprise needs. Provides a NoCaptcha alternative to reCAPTCHA, captcha account management, GDPR compliance and availability across the globe (China included). Limited in low-friction captcha capabilities.
  • Solve Media captcha: Ad-driven captcha that uses advertisements to generate the captchas to be solved. GDPR compliant, beautiful and customizable. It may not be a good idea to show advertisements on an enterprise site.

Captcha library providers: There are a lot of players in the captcha library space, and if you are willing to set up and manage the code, some of the options are:

  • BotDetect CAPTCHA: Most widely used captcha library, available in multiple languages. They license the library, which then needs to be implemented and managed.
  • KeyCAPTCHA - Innovative Anti-Spam Solution: Plugin-driven captcha covering a wide range of CMS systems. Mostly for CMS-driven sites; needs self-hosting and management. Permutations are limited for captcha generation.

@Findus23 commented on May 9th 2019 Member

I just came across https://www.phpcaptcha.org/ which seems to be the only local open source captcha solution that has a wordpress plugin: https://wordpress.org/plugins/securimage-wp/

But I don’t know how well it supports the forms used on matomo.org

@Crypto-Loot commented on June 10th 2019

Hi there,
We offer a PoW (proof-of-work) based captcha system where a user must verify a captcha via mining a cryptocurrency for several seconds before proceeding to confirm the token. You may find more at our website: https://crypto-loot.org (will have to login to see the demo/code)

We are also doing a rebrand shortly along with a potential partner to help bring web mining into the white light for the industry.

Please feel free to let us know if you would like to work with us!

@mattab commented on June 29th 2019 Member

@joekarns commented on September 19th 2019

One non-google product you could use to better protect your login page (or any page of the site) would be using the free version of Cloudflare. I use "Page Rules", then configure only my login page with the form on it to be in "under attack" mode in Cloudflare. By doing so, it scans any/all users who try to access that page of the site. It's not a perfect solution but it should cut out most of the pure bots hitting that page. Hope that helps.

@Findus23 commented on September 19th 2019 Member

@joekarns Using Cloudflare might be even worse as it

  • allows one entity to intercept a huge fraction of the internet's traffic
  • cuts off a large fraction of internet users (e.g. Tor users)
  • still uses ReCaptcha (I think) to detect bots

@joekarns commented on September 19th 2019

Yes, fair points.

@mattab commented on January 30th 2020 Member

We're still actively looking for an alternative to Google recaptcha!

If you have any hints, we'd love to hear them!

@ara4n commented on February 9th 2020

we are too, over at https://github.com/vector-im/riot-web/issues/3606 (in the interests of sharing any discoveries). (Riot also uses matomo for its analytics, fwiw :)

@raneq commented on February 13th 2020

What about:

  • Captcha code ?? It's up to date and looks clean to me. I don't know if it's effective though.

For the record:

  • Visual Captcha? It's not maintained, but it may just work. Wrong, the wordpress plugin is not up to date.
  • Secure Image: It's not the cutest, but it's just a matter of CSS. But it's also not up to date.
  • Lepture Captcha: For Python projects, it looks like a good resource, even though the last commit is from Nov 2018.

It feels like interest in light and effective captchas has dropped a lot. Thank you for not giving up on this.

@Findus23 commented on February 13th 2020 Member

I am really not sure if those Captchas that just use GD to print a random string to an image and sprinkle a few dots or lines above it are really helpful.

  • They are all completely inaccessible, so a lot of people are completely prevented from submitting a form.
  • I really doubt any but the most trivial bot is unable to detect those images via OCR (especially as they use known font-files)
  • most of them seem to have the latest commit many years ago

For captcha-code-authentication specifically there seem to be multiple reviews mentioning that removing the captcha from the form circumvents it.
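
That kind of bypass usually means the server only validates the captcha when the field is present. A minimal sketch of a check that fails closed instead (the field name `captcha_answer` and the function are hypothetical, not taken from that plugin):

```python
import hmac
from typing import Mapping


def captcha_passed(form: Mapping[str, str], expected_answer: str) -> bool:
    """Treat a missing captcha field as a failure, not as 'nothing to check'."""
    submitted = form.get("captcha_answer")
    if submitted is None or not expected_answer:
        return False
    # Constant-time comparison of the normalised answers.
    return hmac.compare_digest(
        submitted.strip().casefold().encode(),
        expected_answer.casefold().encode(),
    )
```

The expected answer has to come from the server-side session, never from another form field, otherwise a bot can simply submit both.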

@mattab commented on March 4th 2020 Member

Btw we could also self-host the Google reCAPTCHA and proxy requests. This would help people from China at least, and may limit some of the privacy implications? Using this: https://github.com/google/recaptcha

PHP client library for reCAPTCHA, a free service to protect your website from spam and abuse. http://www.google.com/recaptcha/

@Findus23 commented on March 4th 2020 Member

> self-hosted the google recaptcha and proxy requests

That would solve the issue for Chinese users, but it might make privacy even worse as it would be harder to block and might open new privacy law issues as users can't opt out anymore.

@Reechik8760 commented on March 23rd 2020

I'm also looking for a good captcha to use that protects a user's privacy. One solution that doesn't work for me but might be ok for you is: https://www.hcaptcha.com/

Users are labeling data for free with hcaptcha and we don't know what is being done with the labeled data. As a result I'm not using it.

@tsteur commented on March 29th 2020 Member

I just came across https://www.hcaptcha.com/ as well. It looks quite interesting and there is a WordPress plugin https://wordpress.org/plugins/hcaptcha-for-forms-and-more/

I suppose it's at least better than Google but didn't look into any terms or privacy policy.

@Findus23 commented on March 29th 2020 Member

Things I noticed with hcaptcha:

  • The captcha itself varies from obvious to impossible (three blurry, distorted images and nine equally incomprehensible images and somehow one has to find a connection between them)
  • The tasks are looping in a very small set. If none of the few tasks available at the moment is doable, one can't submit.
  • Their solution for users who can't do visual tests is forcing them to create an account and share their personal data, which doesn't really feel appropriate (https://www.hcaptcha.com/accessibility)
  • The JS mentions that its license can be found at https://hcaptcha.com/license which is a 404
  • They have a privacy policy, but I think it is not linked anywhere (https://www.hcaptcha.com/privacy)
  • Update: It is linked in the captcha itself, but using a 9px font and using #cccccc text on #fafafa background (which is the lowest color contrast I have seen in a long time)
  • Children under 13 are banned from using the service which isn't really an issue but is a bit weird.

Weird quotes from the privacy policy:

> Some of the information you provide us may constitute sensitive data as defined in the GDPR (also referred to as special categories of personal data), including identification of your race or ethnicity on government-issued identification documents.

> please be aware that your personal data will be transferred to, processed, and stored in the United States. Data protection laws in the U.S. may be different from those in your country of residence. You consent to the transfer of your information, including personal information, to the U.S. as set forth in this Privacy Policy by visiting our site or using our service.

(I don't think that's how consent works)

So I think the major benefits compared to reCAPTCHA are:

  • it is not Google
  • you get a (potentially very tiny) fraction of the Ethereum tokens earned
  • it might not do any actual tracking to detect humans

@Reechik8760 commented on March 29th 2020

@Findus23 -- thank you very much for this great analysis. If I find any good open source solutions that protect people's privacy (or end up creating my own Captcha) I will be sure to post it.
