What babies tell us about artificial intelligence

They may outwit the chess Grandmaster Kasparov, but can machines ever be as smart as a three-year-old?

    Instead of trying to produce a programme to simulate the adult mind, why not rather try to produce one which simulates the child’s? – Alan Turing, 1950.

In an “Ideas Lab” discussion at Davos on 20 January, four academics from Berkeley, my university, will be debating whether computers can (or will? or might sometime in the future? or can’t possibly?) make decisions better than people (the tenses are a big part of the debate).

The group will include a roboticist, a computer scientist, a neuroscientist and me – a developmental psychologist. Why will I be there? It turns out that children are a crucial but underappreciated part of the debate over artificial intelligence.

Turing’s intelligence test

Everyone remembers that Alan Turing proposed the imitation game to test whether a machine was intelligent. If a human being sat at a keyboard and couldn’t tell whether she was talking to a machine or a person, then the machine would have passed “the Turing test.”

Almost no one remembers that in the very same paper Turing suggested that the key to achieving intelligence would be to design a machine that was like a child, not an adult. He pointed out, presciently, that the real secret to human intelligence is our ability to learn. Learning has been at the center of the new revival of AI. For the last 15 years or so computer scientists and developmental cognitive scientists have been trying to figure out how children learn so much so quickly, and how to design a machine that could do the same.

The history of AI is fascinating because it’s been so hard to predict which aspects of human thought would be easy to simulate and which would be difficult. At first, we thought that things like playing chess or proving theorems, the corridas of nerd machismo, would prove to be hardest for computers. In fact, they turn out to be easy. Things every fool can do, like recognizing a cup or picking it up, turn out to be much harder. And it turns out to be much easier to simulate the reasoning of a highly trained adult expert than to mimic the ordinary learning of every baby. So where are machines catching up to children, and what kinds of learning are still way beyond their reach?

In the last 15 years we’ve discovered that even babies are amazingly good at detecting statistical patterns. Machines have also become remarkably good at statistical learning. Techniques like “deep learning” can detect even very complicated statistical regularities in enormous data sets. The result is that computers have suddenly become able to do things that were impossible before, like labeling internet images accurately.

But the trouble with this sort of purely statistical machine learning is that it depends on having enormous amounts of data, and data that is predigested by human brains. And yet, even with all that help, machines still need gigantic data sets and extremely complex computations to be able to look at a new picture and say “kitty-cat!”—something every baby can do with just a few examples.

More profoundly, you can only generalize from this kind of statistical learning in a limited way, whether you’re a baby or a computer or a scientist. A more powerful way to learn is to formulate hypotheses about what the world is like and test them against the data. Back in the sixteenth century, Tycho Brahe, the Google Scholar of his day, amalgamated an enormous data set of astronomical observations, and he could use them to predict star positions in the future. But in the following century, Johannes Kepler’s heliocentric hypotheses allowed him to make unexpected, wide-ranging, entirely novel predictions that were well beyond Brahe’s ken. Preschoolers can do the same thing.

One of the other big advances in machine learning has been to formalize and automate this kind of hypothesis-testing. Introducing Bayesian probability theory into the learning process has been particularly important. We can mathematically describe a particular causal hypothesis, for example, and then calculate just how likely that hypothesis is to be true, given the data we see. Machines have become able to test hypotheses against the data in this way extremely well, with consequences for everything from medical diagnosis to meteorology. When we study young children they turn out to reason in a similar way and that may help to explain how they can learn so much.

Three differences between children and computers

But there are three things even very young human children do that are still very far from anything that current computers can do – or even that we can visualize them doing in the near future. And they make the gulf between artificial intelligence and human intelligence particularly vivid.

Computers have become extremely skilled at making inferences from structured hypotheses, especially probabilistic inferences. But the really hard problem is deciding which hypotheses, out of all the infinite possibilities, are worth testing. Even preschoolers are remarkably good at coming up with brand new, out-of-the-box concepts and hypotheses and then testing them. Somehow they combine rationality and irrationality, systematicity and randomness to do this, in a way that we still haven’t even begun to understand.

In another neglected part of his landmark paper, Turing presciently argued that it might be good if his child computer acted randomly, at least some of the time. And three-year-olds’ thoughts and actions often do seem random, even crazy – just join in a pretend game sometime. This is exactly why psychologists like Piaget thought that they were irrational and illogical. But they also have an uncanny capacity to zero in on the right sort of weird hypothesis – in fact, research in our lab has shown that they can be substantially better at this than grown-ups.

In one experiment, for example, we gave four-year-olds and Berkeley undergraduates a new gadget to figure out – a machine that lit up when you put some things on it and not others. We showed the kids and adults the machine working in an obvious way or a more unusual way, and then let them figure out how to make it go with some new blocks. The kids quickly learned the unusual hypothesis but the adults stuck with the obvious idea, in spite of the data. This might seem surprising at first. But maybe not if, like me, you’ve watched your 3-year-old grandchild quickly and spontaneously master the smartphone that took you a month to figure out.
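
To make the Bayesian hypothesis-testing described earlier a little more concrete, here is a minimal Python sketch of how a learner might weigh an obvious idea against an unusual one for a gadget like this. It is an illustration only, not the model or the data from the study: the two rules (`single` and `pair`), the prior favoring the obvious rule, the noise level and the observations are all made-up assumptions.

```python
from fractions import Fraction

# Two hypothetical rules for how the gadget works (assumptions for illustration):
#   "single": it lights up when at least one block is placed on it (the obvious idea)
#   "pair":   it lights up only when at least two blocks are placed together (the unusual idea)
def predicts_light(rule, n_blocks):
    if rule == "single":
        return n_blocks >= 1
    if rule == "pair":
        return n_blocks >= 2
    raise ValueError(f"unknown rule: {rule}")

def likelihood(rule, observation, noise=Fraction(1, 10)):
    """Probability of one observation under a rule, allowing for an assumed error rate."""
    n_blocks, lit = observation
    return 1 - noise if predicts_light(rule, n_blocks) == lit else noise

def posterior(prior, observations):
    """Bayes' rule: multiply prior by likelihood for each rule, then normalize."""
    unnormalized = {}
    for rule, weight in prior.items():
        for obs in observations:
            weight *= likelihood(rule, obs)
        unnormalized[rule] = weight
    total = sum(unnormalized.values())
    return {rule: weight / total for rule, weight in unnormalized.items()}

# Assumed prior: the learner starts out strongly favoring the obvious rule.
prior = {"single": Fraction(4, 5), "pair": Fraction(1, 5)}

# Made-up demonstrations: (number of blocks placed, did the machine light up?)
observations = [(1, False), (1, False), (2, True)]

post = posterior(prior, observations)
for rule, prob in post.items():
    print(rule, float(prob))
```

With these made-up numbers, three demonstrations are enough to shift most of the probability onto the unusual rule, even though the prior favored the obvious one. A learner who holds the prior too tightly, or who discounts the evidence, would stay with the obvious idea for longer.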

We think that the kids were exploring possibilities in a wider, more unpredictable way than grown-ups, just as they do when they “get into everything” in their exploratory play, or think up crazy alternatives to reality in their pretending. But we have almost no idea how this sort of rational randomness, this constrained creativity, is possible.

A second area where children outshine computers is in their ability to actually go out and explore the world around them. Turing refers to this as “the ability to go fetch the coal scuttle” and suggests that computers can do without it. But, after all, experimentation is a crucial, and rather mysterious, part of scientific learning.

More and more studies show that even very young babies actively explore the world around them, and that this kind of active exploration is crucial for learning. Babies systematically look longest at the events around them that are most likely to be informative, and they play with objects in a way that will teach them the most. This ability to actually extract the right data from the world, rather than just processing the data that you are given, is very powerful. It also involves the same tension between rationality and randomness as children’s creative hypothesis generation. It is still far beyond the capacity of any machine we know of.

A third way that children learn is by getting information from the other people around them. Most people don’t appreciate that many of the recent successes of machine learning also depend deeply on information from humans. Computers can only recognize internet images because millions of real people have reduced the unbelievably complex information at their retinas to a highly stylized, constrained and simplified Instagram of their cute kitty, and have clearly labeled that image, too. (In some ways the dystopian fantasy of “The Matrix” is a simple fact: we’re all actually helping computers become more powerful, under the anesthetizing illusion that we’re just having fun with lol cats.) And they can translate, more or less, because they can take advantage of enormous databases of human translations.

Computers can’t roll their eyes

But current computers use the information from people in a relatively simple and thoughtless way. New studies show that, in contrast, even three-year-olds can assess the testimony of other people in a surprisingly sophisticated way. They can tell whether someone is reliable or wonky, naïve or expert. They can even use very subtle cues to tell whether a grown-up is just saying what they think, or is intentionally trying to be instructive. And very young children learn differently depending on what they think about the person talking to them. (Even the most sophisticated computers have yet to master the ability to roll their eyes at adult fatuity.)

Of course, Turing’s great insight was that once we have a complete step-by-step account of any process we can program it on a computer. This is where the tenses come in – at the moment it’s a matter of faith whether such an account of human intelligence will be possible or what it will look like. As scientists, we naturally want to endorse that faith, and to hope that such an account will be possible, though at this point, it’s still no more than faith and hope.

And, after all, we know that there are intelligent physical systems that can do all these things. In fact, most of us have actually created such systems and enjoyed doing it too (well, at least in the earliest stages). We call them our kids. Computation is still the best, indeed the only, scientific explanation we have of how a physical object like a brain can act intelligently. But, at least for now, we have almost no idea at all how the sort of intelligence we see in children is possible. Until we do, the largest and most powerful computers will still be no match for the smallest and weakest humans.

Author: Alison Gopnik is Professor of Psychology and Affiliate Professor of Philosophy at the University of California at Berkeley. She is participating in the World Economic Forum’s Annual Meeting in Davos 2015. This post is an expanded version of a post on the Edge website.