rants from the dark side of marketing

How to break captchas

There are only a few things I like better than having to type a captcha before I can get to my porn or post to my favourite web2.0 site: things like getting kicked in the crotch, food poisoning and Alanis Morissette. Without further ado, here is how to break captchas, part one.

Wintercore – Breaking Gmail’s Audio captcha

Vorm – Defeating audio (voice) captchas

Always look at the audio version of the captcha you are trying to break.

DefCon 15 – T535 – Black Ops 2007: Design Reviewing The Web

Interesting video all around. Skip to the last few minutes for an interesting approach to breaking audio captchas (though you should really watch the whole thing).

AlwaysMoveFast – Cracking CAPTCHAs for Fun and Profit

Useful ideas and sample code (in Python).

WebSecurity.com.ua – Various captcha vulnerabilities

WebSecurity.com.ua – More captcha vulnerabilities

Various tricks for bypassing captchas, mostly without actually solving them.

RookSecurity – Captchas

How to break PHP-Nuke’s captcha (used by other applications as well) and Simple Machines Forum’s audio captcha. For the first one, he generated all possible images and stored their hashes in a database, so you only have to look up the hash of your captcha in the database to solve it. For SMF’s audio captcha he calculated Hamming distances, although perhaps he should have used the more flexible Levenshtein distance, which also copes with inserted or missing elements.
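To make the two ideas above concrete, here is a minimal sketch in Python. It is not the RookSecurity code: the choice of SHA-1, the function names and the idea of comparing plain strings are all my assumptions for illustration. The first half shows the precomputed hash lookup, the second half shows why Levenshtein is more forgiving than Hamming when two sequences are merely shifted against each other.

```python
import hashlib


def build_hash_table(images):
    """Map the SHA-1 digest of each image's raw bytes to its known text.

    `images` is a hypothetical dict of {captcha_text: image_bytes}.
    """
    return {hashlib.sha1(data).hexdigest(): text for text, data in images.items()}


def lookup(table, data):
    """Return the stored text for a byte-for-byte identical image, else None."""
    return table.get(hashlib.sha1(data).hexdigest())


def hamming(a, b):
    """Count the positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    table = build_hash_table({"AB12": b"fake-image-1", "ZX90": b"fake-image-2"})
    print(lookup(table, b"fake-image-1"))

    # The same pattern shifted by one sample: Hamming sees a total mismatch,
    # Levenshtein sees one deletion plus one insertion.
    a, b = "0101010101", "1010101010"
    print(hamming(a, b), levenshtein(a, b))
```

The hash lookup only works when the captcha generator produces exactly the same bytes for the same text, which is the situation described above. For anything noisier you are back to distance measures.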

DarkSeoProgramming – PHPBB3 Captcha is super easy

DarkSeoProgramming – Instant GOCR Training

DarkSeoProgramming – Letter Derotation

DarkSeoProgramming – GOCR to Neural Nets Pt 2

DarkSeoProgramming – 10 Steps to Solving a Captcha

DarkSeoProgramming – A custom floodfill routine

DarkSeoProgramming – Replacing GOCR part 1

Lots of good stuff, plus sample source code.

The Captcha Breaker (in OCaml)

This howto will take you through using Captcha Breaker to break a given
captcha. This howto covers only how to use the solvers once you already have
image files. This howto does NOT cover how to extract that file from a web
site, nor how to enter the text back.


The author has decided to release the source code for PWNtcha. It is old but probably worth examining.

Captcha Recognition via Averaging

Averaging a series of images can improve the image quality of captchas (reducing distortion, or improving the signal-to-noise ratio, so to speak) and hence make them more easily recognizable by OCR (optical character recognition) systems.
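A rough sketch of the averaging idea in Python, under the assumption that you can get several renderings of the same captcha text with different noise each time, and that each image is already loaded as a list of rows of greyscale values (0 = black, 255 = white). The function names and the cutoff of 128 are mine, not taken from the linked article.

```python
def average_images(images):
    """Pixel-wise mean of equally sized greyscale images (lists of rows, 0-255)."""
    if not images:
        raise ValueError("need at least one image")
    height, width = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError("all images must have the same dimensions")
    count = len(images)
    return [
        [sum(img[y][x] for img in images) / count for x in range(width)]
        for y in range(height)
    ]


def binarize(image, cutoff=128):
    """Turn an averaged image back into pure black (0) and white (255) pixels."""
    return [[0 if value < cutoff else 255 for value in row] for row in image]


if __name__ == "__main__":
    # Three tiny 1x4 "images": the two middle pixels are the glyph and are
    # black every time, the outer pixels carry noise that changes per image.
    img1 = [[0, 0, 0, 255]]
    img2 = [[255, 0, 0, 0]]
    img3 = [[255, 0, 0, 255]]
    print(binarize(average_images([img1, img2, img3])))
```

Pixels that belong to the text are dark in every sample and stay dark after averaging, while noise that moves around gets washed out towards grey and falls on the white side of the cutoff.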

BluehatSEO – Del.icio.us Captcha Cracked

BluehatSEO – Captcha Breaking W/ PHPBB2 Example

BluehatSEO – Creating An Army of Free Captcha Typers

The del.icio.us solver uses a useful and easy technique for removing noise, which I like very much and you should learn. It goes through each black pixel and counts how many of the surrounding pixels are black. If the number of black neighbours is below a certain threshold (4 in this case), the pixel is removed. You should calculate the neighbour counts for all pixels first, then change all those below the threshold to white in one pass. If you change them incrementally, each removal alters the neighbour counts of pixels you have not visited yet, and bad things may happen, so keep that in mind.

You should experiment in each case to find what gives the best results. The optimal threshold value is usually 3 or 4. More importantly, you can experiment with multiple passes, for example using 2 for the first pass, then 3 for the second and 4 for the final pass.
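Here is a minimal Python sketch of the neighbour-counting filter described above. It is not the BluehatSEO code. I am assuming an image stored as a list of rows with 0 for black and 255 for white, and that "surrounding pixels" means the 8 adjacent ones; the function names are made up for this example.

```python
BLACK, WHITE = 0, 255


def count_black_neighbours(image, x, y):
    """Count black pixels among the (up to) 8 pixels surrounding (x, y)."""
    height, width = len(image), len(image[0])
    total = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue  # skip the pixel itself
            nx, ny = x + dx, y + dy
            if 0 <= nx < width and 0 <= ny < height and image[ny][nx] == BLACK:
                total += 1
    return total


def remove_noise(image, threshold):
    """Return a copy where black pixels with fewer than `threshold` black
    neighbours have been turned white."""
    height, width = len(image), len(image[0])
    # Step 1: decide which pixels go, looking only at the untouched input.
    doomed = [
        (x, y)
        for y in range(height)
        for x in range(width)
        if image[y][x] == BLACK and count_black_neighbours(image, x, y) < threshold
    ]
    # Step 2: apply all removals at once, so no removal influences another.
    cleaned = [row[:] for row in image]
    for x, y in doomed:
        cleaned[y][x] = WHITE
    return cleaned


def remove_noise_multipass(image, thresholds=(2, 3, 4)):
    """Run several passes with increasing thresholds."""
    for threshold in thresholds:
        image = remove_noise(image, threshold)
    return image


if __name__ == "__main__":
    rows = ["......", ".###..", ".###.#", ".###..", "......"]
    image = [[BLACK if c == "#" else WHITE for c in row] for row in rows]
    for row in remove_noise(image, 2):
        print("".join("#" if p == BLACK else "." for p in row))
```

With a threshold of 2 the lonely pixel on the right disappears and the solid block survives. Push the threshold to 4 and the corners of the block start to go as well, which is why it pays to try a few values on the captcha you are actually dealing with.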

Another interesting idea is setting up some honeypots so you get infected by the botnets that solve captchas, for example the bots solving Live’s and Google’s captchas. You might be able to acquire the captcha breaking code or, perhaps even better, get access to the service that the bots send their captchas to for solving.

Even more related links:

Yahoo/Hotmail/Google CAPTCHA Extraction

Using AI to beat CAPTCHA

Old Yahoo Captcha solver (now fixed)







Using WhatTheFont

Breaking CAPTCHAs without using OCR

Posted on Friday, May 30th, 2008 at 10:24 am under Rants.


roddik Says:

Hey, has anyone managed to download pwntcha? It says “It can be downloaded from Subversion: svn co svn://svn.zoy.org/libcaca/pwntcha/trunk pwntcha” but wtf that is and I never managed to actually get those files. Could you please post an alternative link?

antelle Says:

roddik, it works now

potlord Says:

You can also use the given Repo url by leaving out “pwntcha” after trunk….


should work for you

Starck Says:

Very nice links to train my skills cracking captchas!

Thanks to the author!

James Says:

Captcha Solvers like deathbycaptcha.com and bypasscaptcha.com






Copyright 2008, blackhat-seo.com