Using ALTCHA to stop spambots without training ML models
A company I contract with recently asked me to help reduce fake signups on their registration page. The standard answer is a CAPTCHA, but I don’t like them:
- CAPTCHA companies train machine-learning (“AI”) models with the results, and often sell them to dubious customers.
- CAPTCHAs often depend on a third-party service, which could go down.
- Absolutely nobody likes solving the little puzzles!
So, when researching relatively ethical CAPTCHA options, I found an alternative: a “proof-of-work” CAPTCHA called ALTCHA.
Instead of asking the user to solve a visual puzzle, it asks the user’s device to solve a computational puzzle. The idea is:
- Most spambots on the web run very simple web browsers that can’t run the puzzle code.
- For spambots that can run the puzzle, it’s expensive. A few seconds of CPU time isn’t a big deal for one real person, but can add up quickly for someone trying to attack at scale.
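To make the cost argument concrete, here is a minimal Ruby sketch of a hash-based puzzle of this general kind. It is an illustration only, not ALTCHA's library code: the method names `make_puzzle` and `solve_puzzle`, the salt, and the number range are all assumptions made up for this example. The point is that the solver has no shortcut and must hash candidates one at a time.

```ruby
require "digest"

# Hypothetical puzzle: the server picks a secret number and publishes only
# the salt and the SHA-256 hash of salt + number.
def make_puzzle(salt, secret_number)
  Digest::SHA256.hexdigest("#{salt}#{secret_number}")
end

# The solver hashes candidates one by one until it finds a match.
# Returns [number, attempts], or nil if nothing up to max_number matches.
def solve_puzzle(salt, target_hash, max_number)
  (0..max_number).each do |candidate|
    if Digest::SHA256.hexdigest("#{salt}#{candidate}") == target_hash
      return [candidate, candidate + 1]
    end
  end
  nil
end

target = make_puzzle("example-salt", 12_345)
number, attempts = solve_puzzle("example-salt", target, 50_000)
puts "solved with number=#{number} after #{attempts} hashes"
```

One visitor pays for those hashes once. A bot submitting thousands of forms pays for them thousands of times, and raising the upper bound of the range raises the price for everyone.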
I think there are useful questions to ask here, like: what would be the environmental impact of a large product adopting this technique? How disproportionately does this affect users with lower-end devices?
But I decided that, compared to the alternatives, and at this product’s relatively small scale, this was the most ethical spam prevention available to us.
Implementing ALTCHA without third-party APIs
One thing I appreciate about ALTCHA is that it can run in your own app, instead of depending on a third-party server.
To achieve this, ALTCHA provides libraries for multiple languages, and also pseudocode for how to generate and validate puzzles yourself.
Personally, I like to only run third-party libraries that I trust very, very much. So, rather than use ALTCHA’s Ruby library, I opted to implement a small version of the code in Ruby myself, using their pseudocode and Ruby library as a reference.
(It’s usually very dangerous to implement cryptographic code yourself, even if you’re relatively familiar with it! But in this case, the risk is just “spam could get through”, and I think that’s better than the risk of supply-chain attacks.)
For the frontend component, I decided not to try to implement it myself, and to use the library ALTCHA provided instead: it’s a lot more complex than the backend code, and I consider frontend libraries a bit less dangerous than backend libraries!
But, to manage the trust risk: I copy-pasted the code rather than using an npm reference, and I also audited the library for suspicious things. (There was one encoded blob of JS, which I decoded and confirmed to be just the expected web worker logic.)
So, the workflow in my app is:
- When generating the signup form, create a computational “puzzle”, and embed it in the HTML.
- When the page loads, the frontend JavaScript code starts to solve the puzzle in the background.
- After a few seconds, a checkmark appears, and the puzzle solution is added to the HTML form data.
- When the form is submitted, we validate the puzzle’s solution, and check its “signature” to confirm it’s a puzzle we generated.
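The server side of that workflow (the first and last steps) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the app's actual code and not ALTCHA's library API: the names `create_puzzle`, `valid_solution?`, `PUZZLE_HMAC_KEY`, and `MAX_NUMBER` are hypothetical, and decoding the value the frontend widget submits is left out. It also assumes a Ruby recent enough to provide `OpenSSL.secure_compare`.

```ruby
require "digest"
require "openssl"
require "securerandom"

# Hypothetical settings: a real app would load the key from secret config.
HMAC_KEY = ENV.fetch("PUZZLE_HMAC_KEY", "dev-only-key")
MAX_NUMBER = 100_000 # upper bound of the search space, i.e. the difficulty

# The "signature" is an HMAC of the challenge, so only the server can mint it.
def sign(challenge, key)
  OpenSSL::HMAC.hexdigest("SHA256", key, challenge)
end

# Step 1: build the puzzle to embed in the signup form.
# The secret number itself is never sent to the browser.
def create_puzzle(key: HMAC_KEY, max_number: MAX_NUMBER)
  salt = SecureRandom.hex(12)
  secret_number = SecureRandom.random_number(max_number + 1) # 0..max_number
  challenge = Digest::SHA256.hexdigest("#{salt}#{secret_number}")
  { salt: salt, challenge: challenge,
    signature: sign(challenge, key), max_number: max_number }
end

# Step 4: check that the submitted number really solves the puzzle, and that
# the puzzle carries a signature made with our key.
def valid_solution?(salt:, number:, challenge:, signature:, key: HMAC_KEY)
  expected_challenge = Digest::SHA256.hexdigest("#{salt}#{number}")
  solved = OpenSSL.secure_compare(expected_challenge, challenge.to_s)
  ours = OpenSSL.secure_compare(sign(expected_challenge, key), signature.to_s)
  solved && ours
end
```

The signature check is what stops an attacker from inventing an easy puzzle of their own: without the key, they cannot produce a matching HMAC for a challenge the server never issued.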
(In my case, I skipped the “nonce” step that would prevent an attacker from simply solving the puzzle once and reusing it forever. This is because I don’t think we’re dealing with targeted attacks built around our site in particular. If we were, that would be essential!)
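For anyone who does need that protection, one common shape for it is to give each puzzle an expiry time and remember which puzzles have already been redeemed until they expire. The sketch below is an assumption about how that could look, not something from the app or from ALTCHA's library: `UsedChallengeStore` and `redeem` are hypothetical names, the store is in-memory and per-process, and it only helps if the expiry time is covered by the puzzle's signature so that it can't be edited.

```ruby
# Hypothetical single-process store of redeemed challenges.
# A multi-process app would need a shared store (a database, a cache) instead.
class UsedChallengeStore
  # clock is injectable so the behaviour can be tested without waiting.
  def initialize(clock: -> { Time.now.to_i })
    @clock = clock
    @expiry_by_challenge = {}
  end

  # Returns true the first time an unexpired challenge is redeemed,
  # and false for expired challenges and for replays.
  def redeem(challenge, expires_at)
    now = @clock.call
    # Expired entries can be forgotten: they are rejected by time alone.
    @expiry_by_challenge.delete_if { |_, expiry| expiry <= now }
    return false if expires_at <= now
    return false if @expiry_by_challenge.key?(challenge)

    @expiry_by_challenge[challenge] = expires_at
    true
  end
end

store = UsedChallengeStore.new
expires_at = Time.now.to_i + 600
puts store.redeem("abc123", expires_at) # first use
puts store.redeem("abc123", expires_at) # replay of the same puzzle
```

Because every puzzle expires, the store never has to remember anything for longer than the expiry window.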
Success!
And, ta da! Now I have a CAPTCHA alternative running in my own app, with:
- No ML models being trained!
- No dependence on a third-party server!
- No puzzles for the user to solve!
And I’ve confirmed with the team: spam signups have stopped completely, and the number of legitimate-seeming signups appears unchanged.
So! I was surprised and impressed by how well this worked, and I hope it can be helpful to others who are looking for CAPTCHA alternatives, too!