Innovative thinking: reCAPTCHA
Chris Harris
By turning a problem’s definition on its head, the reCAPTCHA team at CMU has done a remarkable job of solving two problems at once.
You know the jumbled-up words you have to type into a website when you forget your password or post a comment on some blogs? Those are called CAPTCHAs. They’re designed to determine that you’re in fact human. They are very common - more than 60 million CAPTCHAs are answered every day! Here are some great examples from Greg Mori’s research on breaking CAPTCHAs.
(A really hard one)

(A much easier one, from EZ-Gimpy - Mori’s program can beat about 90% of ones like this)

The folks at CMU asked the question in reverse: what problems are so hard for computers that, even with state-of-the-art software, we’d rather have a person solve them? The reCAPTCHA team thought that Optical Character Recognition (OCR) was the answer.
OCR is far from perfect, and the Internet Archive is scanning tons of books, some of which have degraded-looking text. Here’s a great example from their website:

So how can these be fixed?
The idea is to have a program generate a known random word and distort it to make it hard for a computer to read. At the same time, it picks at random one of the scanned words that the OCR software had difficulty reading (like those in the top line of the example above). The human is then asked to type in both words correctly. If the response is correct for the known word, the program assumes the response came from a person and records that person’s answer for the second (unknown) word!
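Here is a rough sketch of that two-word check in Python. It is only an illustration, not the reCAPTCHA team's actual code: the function names, the dictionary layout, and the `image_id` values are all made up for this example, and the image distortion step is left out entirely.

```python
import random


def make_challenge(control_words, unread_image_ids, rng=random):
    """Pair one control word (answer known) with one scanned word OCR could not read.

    All names here are hypothetical; a real system would serve distorted images.
    """
    slots = [
        {"kind": "control", "answer": rng.choice(control_words)},
        {"kind": "unknown", "image_id": rng.choice(unread_image_ids)},
    ]
    rng.shuffle(slots)  # the user cannot tell which word is the control
    return slots


def grade(challenge, typed, guesses):
    """Return True if the control word was typed correctly.

    On success, the text typed for the unknown word is stored in `guesses`,
    a dict mapping image_id -> list of human answers.
    """
    if len(typed) != len(challenge):
        return False
    # First pass: the known word decides whether we trust this response at all.
    for slot, text in zip(challenge, typed):
        if slot["kind"] == "control" and text.strip().lower() != slot["answer"].lower():
            return False
    # Second pass: only trusted responses contribute a reading of the unknown word.
    for slot, text in zip(challenge, typed):
        if slot["kind"] == "unknown":
            guesses.setdefault(slot["image_id"], []).append(text.strip().lower())
    return True
```

A response that gets the control word wrong is rejected and its guess for the unknown word is thrown away, so failed attempts never pollute the collected answers.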
Now, people aren’t perfect either, so in reality there are a lot of complexities here. To name a few: more than one person is shown a particular group of characters to verify that they agree on the correct answer; the order of the known and unknown words is chosen randomly; and the degradation of the chosen word images is tuned to best serve both security and effective use of the human work for OCR.
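The "do several people agree?" part could look something like the sketch below. The thresholds (`min_votes`, `min_share`) are numbers I picked for illustration; the post's sources do not say what rule reCAPTCHA really uses.

```python
from collections import Counter


def agreed_reading(answers, min_votes=3, min_share=0.75):
    """Return the word most people typed, or None if there is no clear agreement.

    `min_votes` and `min_share` are assumed values, not reCAPTCHA's real settings.
    """
    if len(answers) < min_votes:
        return None  # not enough people have seen this word yet
    word, count = Counter(answers).most_common(1)[0]
    if count / len(answers) >= min_share:
        return word
    return None  # people disagree, so keep showing the word to more users
```

A word stays in the pool until enough independent answers line up, which is what keeps one person's typo from ending up in the digitized book.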
The project is called reCAPTCHA, and you can learn more about it on the project page itself. Great job guys!!



