Using ALTCHA to stop spambots without training ML models

A company I contract with recently asked me to help reduce fake signups on their registration page. The standard answer is a CAPTCHA, but I don’t like them:

CAPTCHA companies train machine-learning (“AI”) models with the results, and often sell them to dubious customers.
CAPTCHAs often depend on a third-party service, which could go down.
Absolutely nobody likes solving the little puzzles!

So, when researching relatively ethical CAPTCHA options, I found an alternative: a “proof-of-work” CAPTCHA called ALTCHA.

Instead of asking the user to solve a visual puzzle, it asks the user’s device to solve a computational puzzle. The idea is:

Most spambots on the web run very simple web browsers that can’t run the puzzle code.
For spambots that can run the puzzle, it’s expensive. A few seconds of CPU time isn’t a big deal for one real person, but can add up quickly for someone trying to attack at scale.
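
To make the idea concrete, here is a minimal sketch of a hash-based proof-of-work puzzle in Ruby. It is an illustration of the general technique, not ALTCHA's exact format: the helper names `make_puzzle` and `solve_puzzle`, the salt, and the number range are all assumptions chosen for the example. The server hashes a salt plus a secret number, and the solver has to try candidates until one produces the same hash.

```ruby
require "digest"

# Hypothetical helper: the server picks a secret number and publishes
# only the hash of salt + number, never the number itself.
def make_puzzle(salt, secret_number)
  Digest::SHA256.hexdigest("#{salt}#{secret_number}")
end

# Hypothetical helper: the solver has no shortcut, so it tries every
# candidate in the range until one hashes to the published challenge.
# Returns nil if no candidate in the range matches.
def solve_puzzle(salt, challenge, max_number)
  (0..max_number).find { |n| Digest::SHA256.hexdigest("#{salt}#{n}") == challenge }
end

challenge = make_puzzle("example-salt", 4_321)
solution = solve_puzzle("example-salt", challenge, 10_000)
```

Checking a solution costs one hash, while finding it costs up to `max_number` hashes, which is where the asymmetry between the site and the spammer comes from.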

I think there are useful questions to ask here, like: what would be the environmental impact of a large product adopting this technique? How disproportionately does this affect users with lower-end devices?

But I decided that, compared to the alternatives, and at this product’s relatively small scale, this was the most ethical spam prevention available to us.

Implementing ALTCHA without third-party APIs

One thing I appreciate about ALTCHA is that it can run in your own app, instead of depending on a third-party server.

To achieve this, ALTCHA provides libraries for multiple languages, and also pseudocode for how to generate and validate puzzles yourself.

Personally, I like to only run third-party libraries that I trust very, very much. So, rather than use ALTCHA’s Ruby library, I opted to implement a small version of the code in Ruby myself, using their pseudocode and Ruby library as a reference.

(It’s usually very dangerous to implement cryptographic code yourself, even if you’re relatively familiar with it! But in this case, the risk is just “spam could get through”, and I think that’s better than the risk of supply-chain attacks.)

For the frontend component, I decided not to try to implement it myself, and to use the library ALTCHA provided instead: it’s a lot more complex than the backend code, and I consider frontend libraries a bit less dangerous than backend libraries!

But, to manage the trust risk: I copy-pasted the code rather than using an npm reference, and I also audited the library for suspicious things. (There was one encoded blob of JS, which I decoded and confirmed to be just the expected web worker logic.)

So, the workflow in my app is:

When generating the signup form, create a computational “puzzle”, and embed it in the HTML.
When the page loads, the frontend JavaScript code starts to solve the puzzle in the background.
After a few seconds, a checkmark appears, and the puzzle solution is added to the HTML form data.
When the form is submitted, we validate the puzzle’s solution, and check its “signature” to confirm it’s a puzzle we generated.
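
The generate and validate steps could look roughly like the sketch below. This is not the code from the app described here, and the names (`generate_challenge`, `valid_solution?`, `hmac_key`, `max_number`) are hypothetical. It assumes a SHA-256 puzzle signed with HMAC-SHA-256 using a secret only the server knows, and it uses `OpenSSL.secure_compare`, which needs a reasonably recent Ruby (2.7 or later).

```ruby
require "digest"
require "openssl"
require "securerandom"

# Build a puzzle to embed in the form. Only the server knows hmac_key,
# so only the server can produce a matching signature.
def generate_challenge(hmac_key:, max_number: 50_000)
  salt = SecureRandom.hex(12)
  secret_number = SecureRandom.random_number(max_number + 1) # 0..max_number
  challenge = Digest::SHA256.hexdigest("#{salt}#{secret_number}")
  signature = OpenSSL::HMAC.hexdigest("SHA256", hmac_key, challenge)
  { salt: salt, challenge: challenge, signature: signature, max_number: max_number }
end

# Check a submitted solution: the number must hash to the challenge,
# and the signature must prove that we issued that challenge.
def valid_solution?(salt:, challenge:, signature:, number:, hmac_key:)
  expected_challenge = Digest::SHA256.hexdigest("#{salt}#{number}")
  expected_signature = OpenSSL::HMAC.hexdigest("SHA256", hmac_key, expected_challenge)
  # Constant-time comparisons, to avoid leaking information through timing.
  OpenSSL.secure_compare(expected_challenge, challenge.to_s) &&
    OpenSSL.secure_compare(expected_signature, signature.to_s)
end

# Example round trip. In the real flow the browser does the brute-force step.
hmac_key = "example-secret-change-me" # assumption: loaded from app secrets in practice
puzzle = generate_challenge(hmac_key: hmac_key)
number = (0..puzzle[:max_number]).find do |n|
  Digest::SHA256.hexdigest("#{puzzle[:salt]}#{n}") == puzzle[:challenge]
end
accepted = valid_solution?(
  salt: puzzle[:salt], challenge: puzzle[:challenge],
  signature: puzzle[:signature], number: number, hmac_key: hmac_key
)
```

Because the signature covers the challenge, the server does not need to store issued puzzles: anything that fails the HMAC check was not generated by this app.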

(In my case, I skipped the “nonce” step that would prevent an attacker from simply solving the puzzle once and reusing it forever. This is because I don’t think we’re dealing with targeted attacks built around our site in particular. If we were, that would be essential!)
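
For anyone who does need replay protection, one possible approach is to give each puzzle an expiry time and accept each challenge only once. The sketch below is an assumption about how that could be done, not the mechanism ALTCHA prescribes: `redeem_challenge` and the in-memory `seen` hash are hypothetical, a real app would use a database or cache shared across processes, and the expiry time would have to be covered by the puzzle's signature so an attacker cannot extend it.

```ruby
# Hypothetical single-use check. `seen` maps challenge => expiry time.
# Returns true the first time an unexpired challenge is redeemed,
# and false for replays or expired puzzles.
def redeem_challenge(seen, challenge, expires_at, now: Time.now)
  seen.delete_if { |_challenge, expiry| expiry <= now } # forget expired entries
  return false if expires_at <= now    # puzzle is too old
  return false if seen.key?(challenge) # puzzle was already used
  seen[challenge] = expires_at
  true
end

seen = {}
now = Time.at(1_000)
first  = redeem_challenge(seen, "abc123", Time.at(1_060), now: now)
replay = redeem_challenge(seen, "abc123", Time.at(1_060), now: now)
stale  = redeem_challenge(seen, "def456", Time.at(990), now: now)
```

Here the first submission is accepted, while the replayed and the already-expired challenges are both rejected. Entries only need to be remembered until they expire, which keeps the store small.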

Success!

And, ta da! Now I have a CAPTCHA alternative running in my own app, with:

No ML models being trained!
No dependence on a third-party server!
No puzzles for the user to solve!

And I’ve confirmed with the team: spam signups have stopped completely, and the number of legitimate-seeming signups appears unchanged.

So! I was surprised and impressed by how well this worked, and I hope it can be helpful to others who are looking for CAPTCHA alternatives, too!