Deciphering Decaptcher’s protocol
Decaptcher offers both a TCP socket API and an HTTP API. This post describes the socket API, as deciphered from Decaptcher’s official PHP client. You can also look at my version of the Decaptcher PHP client, which I posted recently. The following should come in handy if you’d like to code your own client.
The socket API uses a 6-byte header containing protocol_version/command_code/data_size. Let’s call this the cc-header.
protocol_version exists so that old clients keep working when the protocol changes. The version is currently still 1.
command_code is used for actual commands as well as error codes.
data_size is the size in bytes of the data following the cc-header. It is often 0.
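As a rough sketch, the cc-header could be packed and unpacked in Python like this. Only the 6-byte total comes from the description above; the split into a 1-byte version, a 1-byte command code, and a 4-byte little-endian size is an assumption, and the function names are made up for illustration.

```python
import struct

# Assumed layout: 1-byte version, 1-byte command code, 4-byte little-endian size.
# Only the 6-byte total is stated in the text; the field widths are a guess.
CC_HEADER = struct.Struct("<BBI")


def pack_cc_header(command_code, data_size=0, protocol_version=1):
    """Build the 6-byte cc-header for a command."""
    return CC_HEADER.pack(protocol_version, command_code, data_size)


def unpack_cc_header(raw):
    """Split a 6-byte cc-header into (protocol_version, command_code, data_size)."""
    return CC_HEADER.unpack(raw)


header = pack_cc_header(command_code=1, data_size=8)
print(len(header))  # 6
```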
If you are sending a picture or receiving picture text, a second header follows the cc-header. Let’s call this the pic-header.
The pic-header is 20 bytes long and contains pic_timeout/pic_type/data_size/major_id/minor_id.
pic_timeout is used to tell Decaptcher how much time it has to get the captcha back to you.
pic_type serves as an affiliate application id, as far as I know.
data_size is the size in bytes that will follow the pic-header.
major_id and minor_id are sent to you when you get picture text. You send them back when reporting picture text as bad.
You append your image binary after the pic-header, when you are sending.
Picture text comes after the pic-header, when you are receiving.
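Here is a minimal sketch of the pic-header in Python. Five fields in 20 bytes suggests 4 bytes per field, but that width and the little-endian byte order are assumptions, as are the helper names.

```python
import struct

# Assumed layout: five 4-byte little-endian unsigned integers (5 x 4 = 20 bytes),
# in the order pic_timeout, pic_type, data_size, major_id, minor_id.
PIC_HEADER = struct.Struct("<5I")


def pack_pic_header(data_size, pic_timeout=0, pic_type=0, major_id=0, minor_id=0):
    """Build the 20-byte pic-header."""
    return PIC_HEADER.pack(pic_timeout, pic_type, data_size, major_id, minor_id)


def unpack_pic_header(raw):
    """Return the pic-header fields as a dict keyed by field name."""
    names = ("pic_timeout", "pic_type", "data_size", "major_id", "minor_id")
    return dict(zip(names, PIC_HEADER.unpack(raw)))
```

When sending, data_size would be the length of the image binary; when receiving, it is the length of the picture text.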
Logging in:
- Open a socket to Decaptcher’s server.
- Send a cc-header with command_code=1, followed by your username.
- Receive a cc-header with command_code=3, followed by a 32-byte salt.
- Using SHA-256, hash the concatenation of the salt, the MD5 of your password, and your username (in this order).
- Send a cc-header with command_code=4, followed by the hash.
- Receive a cc-header with command_code=7.
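The hashing step could look roughly like this in Python. The order of the three parts comes from the steps above; everything else is an assumption: that the MD5 of the password is used as a lowercase hex string, that the SHA-256 result is sent as hex text, and that the username and password are UTF-8 encoded. The function name is hypothetical.

```python
import hashlib


def login_hash(salt: bytes, username: str, password: str) -> bytes:
    """Hash salt + md5(password) + username with SHA-256, in that order."""
    # Assumption: the MD5 is taken as a lowercase hex string, not raw bytes.
    md5_hex = hashlib.md5(password.encode("utf-8")).hexdigest().encode("ascii")
    digest = hashlib.sha256(salt + md5_hex + username.encode("utf-8"))
    # Assumption: the server expects the hex form of the digest (64 characters).
    return digest.hexdigest().encode("ascii")
```

If the server rejects the hash, the hex-versus-raw choices are the first things worth checking against the official client.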
Sending a picture:
- You must be logged in.
- Send cc-header with command_code=12, then a pic-header and then the picture binary.
- Wait on the socket until you get a cc-header with command_code=14. A pic-header and picture text will follow. You must store the major_id and minor_id you get in the pic-header preceding the picture text (in order to report bad picture text).
- If command_code is not 14 then it’s an error code.
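The steps above can be sketched as two Python helpers, one that builds the outgoing message and one that parses a complete reply. The header field widths and byte order are assumptions (1+1+4 bytes for the cc-header, five 4-byte integers for the pic-header, little-endian), and the function names are hypothetical.

```python
import struct

CC_HEADER = struct.Struct("<BBI")  # assumed: version, command_code, data_size
PIC_HEADER = struct.Struct("<5I")  # assumed: timeout, type, size, major_id, minor_id
CMD_PICTURE2 = 12
CMD_TEXT2 = 14


def build_picture_request(image: bytes, pic_timeout=0, pic_type=0) -> bytes:
    """cc-header (command 12) + pic-header + image binary."""
    pic_header = PIC_HEADER.pack(pic_timeout, pic_type, len(image), 0, 0)
    body = pic_header + image
    # The cc-header data_size covers everything that follows the cc-header.
    return CC_HEADER.pack(1, CMD_PICTURE2, len(body)) + body


def parse_picture_reply(reply: bytes):
    """Return (text, major_id, minor_id), or raise if the server sent an error code."""
    _version, command_code, _size = CC_HEADER.unpack_from(reply, 0)
    if command_code != CMD_TEXT2:
        raise RuntimeError(f"server returned error code {command_code}")
    _, _, text_size, major_id, minor_id = PIC_HEADER.unpack_from(reply, CC_HEADER.size)
    start = CC_HEADER.size + PIC_HEADER.size
    text = reply[start:start + text_size].decode("ascii", errors="replace")
    return text, major_id, minor_id
```

Keep the returned major_id and minor_id around; they are needed to report the text as bad.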
Notifying of bad picture text:
- You must be logged in.
- Send a cc-header with command_code=13, then a pic-header containing the major_id and minor_id that came back with the picture text.
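A bad-text report might be built like this in Python. The header layouts are the same assumed ones (1+1+4 bytes and five 4-byte integers, little-endian), and leaving pic_timeout, pic_type, and data_size at zero is also a guess, since only the two ids are described as meaningful here.

```python
import struct

CC_HEADER = struct.Struct("<BBI")  # assumed: version, command_code, data_size
PIC_HEADER = struct.Struct("<5I")  # assumed: timeout, type, size, major_id, minor_id
CMD_PICTUREFL = 13


def build_bad_picture_report(major_id: int, minor_id: int) -> bytes:
    """cc-header (command 13) + pic-header carrying the ids of the bad answer."""
    # Assumption: the unused pic-header fields are sent as zeros.
    pic_header = PIC_HEADER.pack(0, 0, 0, major_id, minor_id)
    return CC_HEADER.pack(1, CMD_PICTUREFL, len(pic_header)) + pic_header
```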
Getting your API credits balance:
- You must be logged in.
- Send a cc-header with command_code=10.
- Receive a cc-header with command_code=10 and the balance follows as text.
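As a sketch, the balance request could be written over an already logged-in socket like this. The cc-header layout (1+1+4 bytes, little-endian) is an assumption, the helper names are made up, and the balance is returned as the raw text because its exact format is not known here.

```python
import socket
import struct

CC_HEADER = struct.Struct("<BBI")  # assumed: version, command_code, data_size
CMD_BALANCE = 10


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes; a single recv() may return fewer."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def read_message(sock: socket.socket):
    """Read one cc-header plus the data_size bytes that follow it."""
    _version, command_code, data_size = CC_HEADER.unpack(recv_exact(sock, CC_HEADER.size))
    return command_code, recv_exact(sock, data_size)


def get_balance(sock: socket.socket) -> str:
    """Send command 10 and return the balance text from the reply."""
    sock.sendall(CC_HEADER.pack(1, CMD_BALANCE, 0))
    command_code, payload = read_message(sock)
    if command_code != CMD_BALANCE:
        raise RuntimeError(f"server returned error code {command_code}")
    return payload.decode("ascii", errors="replace")
```

The recv_exact loop matters for every command, not just this one, because TCP may deliver a header or payload in several pieces.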
Logging out:
- You must be logged in.
- Send a cc-header with command_code=2.
- Close the socket.
The command codes:
‘cmdCC_UNUSED’, 0
‘cmdCC_LOGIN’, 1 // login
‘cmdCC_BYE’, 2 // end of session
‘cmdCC_RAND’, 3 // random data for making hash with login+password
‘cmdCC_HASH’, 4 // hash data
‘cmdCC_PICTURE’, 5 // picture data, deprecated
‘cmdCC_TEXT’, 6 // text data, deprecated
‘cmdCC_OK’, 7 // ok
‘cmdCC_FAILED’, 8 // failed
‘cmdCC_OVERLOAD’, 9 // server overloaded
‘cmdCC_BALANCE’, 10 // zero balance
‘cmdCC_TIMEOUT’, 11 // time out occurred
‘cmdCC_PICTURE2’, 12 // picture data
‘cmdCC_PICTUREFL’, 13 // picture failure
‘cmdCC_TEXT2’, 14 // text data
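In a Python client, the codes above could be kept in an IntEnum so that an unexpected reply can be logged by name. The values are copied from the list; the class and function names are just illustrative.

```python
from enum import IntEnum


class Command(IntEnum):
    UNUSED = 0
    LOGIN = 1
    BYE = 2
    RAND = 3
    HASH = 4
    PICTURE = 5    # deprecated
    TEXT = 6       # deprecated
    OK = 7
    FAILED = 8
    OVERLOAD = 9
    BALANCE = 10
    TIMEOUT = 11
    PICTURE2 = 12
    PICTUREFL = 13
    TEXT2 = 14


def describe(code: int) -> str:
    """Map a numeric command code back to its cmdCC_ name."""
    try:
        return f"cmdCC_{Command(code).name}"
    except ValueError:
        return f"unknown command code {code}"
```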
Picture timeout codes:
‘ptoDEFAULT’, 0 // default timeout, server-specific
‘ptoLONG’, 1 // long timeout for picture, server-specific
‘pto30SEC’, 2 // 30 seconds timeout for picture
‘pto60SEC’, 3 // 60 seconds timeout for picture
‘pto90SEC’, 4 // 90 seconds timeout for picture
The default picture type:
‘ptUNSPECIFIED’, 0 // picture type unspecified