Cars that drive themselves. A program that trounces the best human players at Jeopardy! A machine that defeats the world champion at chess. It would seem that the quest to create human-level artificial intelligence (AI) is making astounding progress, and the end is in sight. But is it?
In 2012, Google’s self-driving car was granted a special license plate by the State of Nevada, allowing it to operate on public roads in the state [1]. In 2007, six different driverless cars successfully completed the “Urban Challenge”, part of the Grand Challenge series organized by DARPA (the US government agency that funded the invention of the Internet), which entailed navigating a 60-mile simulated urban course. Similarly, in 2005, five such cars successfully navigated a 132-mile desert course [2].
In 2011, IBM’s Watson computer handily defeated two of the world’s best human Jeopardy! players in a televised match [3]. In 2007, checkers was pronounced to be “solved”: a computer program had been developed that could never be beaten [4]. In 1997, IBM’s Deep Blue computer defeated reigning world chess champion Garry Kasparov, winning 3½ to 2½ in a six-game match [5].
The founders of AI, who coined the term in 1956, would have been astounded by what computers have achieved to date. And yet, each new achievement is met with cautious optimism rather than exuberance, and AI skeptics abound. What do we know now that the founders of AI didn’t know back then? The answer, interestingly, can be found in a paper that predates the official founding of AI.
The Turing test
In 1950, British mathematician Alan Turing wrote a paper titled “Computing Machinery and Intelligence”, in which he examined the question, “Can machines think?” Turing’s answer was yes, and he predicted that, by as early as the beginning of the 21st century, machines would be able to pass a test of intelligence that he proposed, now famously known as the Turing test. In this test, a human interrogator is tasked with determining which of two chat-room participants is a computer, and which is a real human. The interrogator can say or ask anything, but all interaction is solely through typed text. If the interrogator cannot distinguish computer from human with better than 70% accuracy after 5 minutes of interacting with each participant, then the computer has passed the test.
The Turing test is not without its detractors, who have pointed out several problems. The first is that the test really measures the ability to mimic humans, which is not the same as being intelligent. Plenty of human behavior is rather unintelligent, such as making spelling errors. In the context of a real-time text chat, having perfect spelling all the time is a dead giveaway for a computer. Similarly, being able to rapidly answer complex arithmetic problems is another telltale sign. For example, dividing one 100-digit number by another is trivial for a computer, but very time-consuming for a human. Thus, the current computer programs that come closest to passing the Turing test are often those that focus on mimicking such idiosyncrasies, rather than programs that focus on replicating intelligence.
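To make the arithmetic giveaway concrete, here is a minimal Python sketch. The two 100-digit values are arbitrary, chosen purely for illustration; the point is only that the language's built-in arbitrary-precision integers handle such a division in a single step.

```python
# Two arbitrary 100-digit integers, invented purely for illustration.
numerator = int("9876543210" * 10)    # ten copies of a 10-digit pattern
denominator = int("1234567890" * 10)  # likewise 100 digits long

# divmod returns the integer quotient and the remainder together.
quotient, remainder = divmod(numerator, denominator)

# The division identity holds exactly; no rounding is involved.
assert numerator == quotient * denominator + remainder

print(len(str(numerator)), len(str(denominator)), quotient)
```

A human working by hand would need many minutes of long division to reach the same answer, which is exactly why an instant reply gives the machine away.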
Another important criticism of the Turing test is that it presumes that human-like intelligence is the only form of intelligence (Figure 1). As an analogy, nobody would dispute that airplanes can fly, but airplanes don’t flap their wings – which is how flight is achieved in the natural world. Similarly, it may be incorrect to assume that any intelligent entity must possess the kind of human-like intelligence that we are naturally familiar with.
Figure 1. Pitfalls of the Turing Test (purple intersection) as a test of intelligence. Humans can display unintelligent behavior (red circle), and humans cannot possibly display all forms of intelligent behavior (blue circle) — such as ultra-fast complex mathematical calculations.
These are just two examples among various criticisms of the Turing test [6]. So why has this test endured as the best-known test for AI? The answer lies in the spirit, rather than the wording, of Turing’s proposal. To put the Turing test into context, the transistor had been invented only a few years before the test was proposed in 1950. Computers, as we know them, did not exist yet. Ironically, “computers” were in fact humans who were employed to perform calculations manually! Given these facts, it is quite understandable why Turing proposed that if a machine could fool a human into believing it was another human, it should most certainly be considered intelligent.
What, then, is the spirit behind the test that Turing put forth? He could have proposed a test involving complex arithmetic. Or solving differential equations. Or remembering large amounts of information. For humans, proficiency in any of these would be taken as evidence of intelligence. However, Turing recognized that machines and humans are fundamentally different in nature, and that machines are naturally adept at certain things that the average human being is bad at (such as performing complex mathematical calculations).
Turing understood that what makes human intelligence valuable is not its quantitative aspects (the “horsepower”). For example, once you know how to perform multiplication, multiplying two 1,000,000-digit numbers is as “easy” as multiplying two 10-digit numbers — it is only a matter of performing the same basic steps over and over. Instead, the key to human intelligence is in its qualitative aspects. What this actually means is still poorly understood, but it seems to involve flexible thinking to deal with a broad range of situations, including new ones that have not been encountered before. This is why Turing proposed a relatively unconstrained interaction as a test for intelligence.
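The multiplication point can be sketched in code. The following is a minimal Python illustration of the grade-school method; the function name `schoolbook_multiply` is made up for this example. The same digit-by-digit step handles inputs of any length, and only the number of repetitions grows.

```python
def schoolbook_multiply(x: str, y: str) -> str:
    """Multiply two non-negative decimal integers given as digit strings."""
    # Store digits least-significant first, so index i is the 10**i place.
    a = [int(ch) for ch in reversed(x)]
    b = [int(ch) for ch in reversed(y)]
    result = [0] * (len(a) + len(b))

    # The one basic step: multiply two single digits, add, and carry.
    for i, digit_a in enumerate(a):
        carry = 0
        for j, digit_b in enumerate(b):
            total = result[i + j] + digit_a * digit_b + carry
            result[i + j] = total % 10
            carry = total // 10
        result[i + len(b)] += carry

    # Leading zeros sit at the end of the reversed list; drop them.
    while len(result) > 1 and result[-1] == 0:
        result.pop()
    return "".join(str(d) for d in reversed(result))


print(schoolbook_multiply("12", "34"))  # prints 408
```

Nothing in the loop depends on how long the inputs are: longer numbers simply mean more passes through the same inner step, which is the sense in which the task needs more “horsepower” but no additional insight.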
Are we there yet?
Coming back to the impressive strides made by computers: should these be taken as evidence that we have begun to crack the AI problem? To provide the proper perspective, we should take a closer look at why computers have surpassed humans in certain “intellectual” endeavors.
To beat world chess champion Garry Kasparov, IBM’s Deep Blue relied heavily on its quantitative, brute-force characteristics. It possessed a database of the optimal chess moves for all possible scenarios when only a small number of chess pieces were left. When there were more pieces than this number, Deep Blue would perform a brute-force search, examining the consequences of all possible next moves, and the next moves after those, and so on. Human players do this too, but to a very limited extent. However, as with the analogy of multiplying 1,000,000-digit numbers versus 10-digit numbers, the basic steps are essentially the same. With the benefit of hindsight, since chess can be “solved” using this brute-force approach, at least in theory, perhaps it is not surprising that, with exponential increases in computing power, computers would inevitably beat humans at chess, even without any “real intelligence” under the hood.
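The exhaustive look-ahead described above can be illustrated on a game far smaller than chess. The sketch below is not Deep Blue’s actual program, whose details are not given here. It applies plain brute-force game-tree search to a hypothetical take-away game in which two players alternately remove 1 to 3 stones and whoever takes the last stone wins. The names `can_win` and `best_move` are made up for this example, and the cache is an added convenience that only avoids re-examining positions already searched.

```python
from functools import lru_cache

MOVES = (1, 2, 3)  # a player may remove 1, 2, or 3 stones per turn


@lru_cache(maxsize=None)
def can_win(stones: int) -> bool:
    """Return True if the player to move can force a win from here."""
    # Try every legal move, then every reply to it, and so on down the tree.
    for take in MOVES:
        if take <= stones and not can_win(stones - take):
            return True
    # No winning move exists (this includes stones == 0, where the
    # opponent has just taken the last stone).
    return False


def best_move(stones: int):
    """Return a move that forces a win, or None if every move loses."""
    for take in MOVES:
        if take <= stones and not can_win(stones - take):
            return take
    return None


print(best_move(10))  # prints 2
print(best_move(8))   # prints None
```

The search contains no knowledge of strategy at all: it simply tries everything. For this toy game that is instantaneous; for chess the same idea runs into an astronomically larger tree, which is why raw computing power mattered so much.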
Current computers are still far from approaching the flexibility of human intelligence that is required to deal with the unconstrained nature of life. If Deep Blue were tasked with playing Monopoly instead of chess, would it even know what to do, or would it essentially have to be reprogrammed? If Watson were tasked with using its immense knowledge base to list the arguments for and against abortion, would it even know that thousands of articles could be summarized into perhaps a few dozen points?
Computers may in fact pass the Turing test in the near future (one current prediction is for the year 2029 [7]). But that would be beside the point. Until we attempt to create computers that succeed at the broad range of activities that even the average human can perform reasonably well, we will not achieve true artificial intelligence.
Cheston Tan was a recent PhD graduate from the Center for Biological and Computational Learning at MIT. He is currently a postdoctoral researcher at the Institute for Infocomm Research, Singapore.
[1] Slosson, S. “Google gets first self-driven car license in Nevada”, Reuters (May 8, 2012) http://www.reuters.com/article/2012/05/08/uk-usa-nevada-google-idUSLNE84701320120508
[2] DARPA Urban Challenge. http://archive.darpa.mil/grandchallenge/
[3] Markoff, J. “Computer Wins on ‘Jeopardy!’: Trivial, It’s Not”, New York Times (Feb 16, 2011) http://www.nytimes.com/2011/02/17/science/17jeopardy-watson.html?_r=1
[4] Sreedhar, S. “Checkers, Solved!” IEEE Spectrum (July 2007)
[5] Long, T. “May 11, 1997: Machine Bests Man in Tournament-Level Chess Match”, Wired (May 11, 2007) http://www.wired.com/science/discoveries/news/2007/05/dayintech_0511
[6] “Turing Test” http://en.wikipedia.org/wiki/Turing_test#Weaknesses_of_the_test
[7] A Long Bet: “By 2029 no computer – or “machine intelligence” – will have passed the Turing Test.” (Mitchell Kapor vs. Ray Kurzweil) The Long Now Foundation. http://longbets.org/1/