When you are driving a car, your brain is taking in an enormous amount of visual information and using it to make driving decisions, such as when to brake or change lanes. The brain needs to determine what kind of information in your field of view is necessary for making these decisions. For example, the position of another car is very important, but a cloud in the sky or the color of that car does not really have an impact on the way you drive.
This is an everyday example of decision-making in a complex natural environment. What is the brain doing in such situations, where there is a high volume of sensory data and a need to make decisions rapidly? To study this and related questions, researchers can turn to simulations of everyday life: video games.
A new study from Caltech compares brain scans of humans playing classic Atari video games to sophisticated artificial intelligence (AI) networks that have been trained to play the same games. Led by graduate student Logan Cross, the researchers compared the trained AI's behavior with that of humans and discovered that the activity in the artificial "neurons" in the AI looked quite similar to activity in the human brain. This implies that the AI agent may solve these decision-making tasks similarly to the human brain, making it a good model for studying how the human brain maps high-dimensional visual input into actions in complex environments.
The study was conducted in the laboratory of Professor of Psychology John O'Doherty. A paper describing the research appeared in the journal Neuron on December 15. O'Doherty is an affiliated faculty member with the Tianqiao and Chrissy Chen Institute for Neuroscience at Caltech.
"The interaction between AI and neuroscience goes both ways," says O'Doherty. "If we can find out how similar AI algorithms are to the brain, this helps us better understand how the brain solves these kinds of hard problems, but conversely if we can understand why and how the brain can solve these games much more efficiently compared to an AI, this may help guide the development of smarter and more humanlike AI algorithms in the future."
In the field of decision neuroscience, which examines the way neural activity in the brain gives rise to decision-making, many studies use simple tasks to examine how humans make decisions. For example, a study participant might be asked to play two slot machines with different payouts. Over the course of the experiment, the participant will learn which slot machine earns more money and adjust their behavior accordingly. The general learning framework for solving these tasks is called reinforcement learning because behavior is reinforced by the rewarding outcomes that result from decisions.
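To make the idea concrete, here is a minimal sketch of that kind of reinforcement learning on a simulated two-slot-machine task. The learning rate, exploration rate, and payout probabilities are illustrative choices, not values from any particular study.

```python
import random

# Minimal sketch of reinforcement learning on a two-armed bandit task,
# analogous to the slot-machine experiments described above.
# All parameters are illustrative.

true_payout_probs = [0.3, 0.7]   # hidden reward probability of each slot machine
values = [0.0, 0.0]              # the learner's running value estimate per machine
alpha = 0.1                      # learning rate
epsilon = 0.1                    # exploration rate

for trial in range(1000):
    # Occasionally explore; otherwise pick the machine currently valued highest.
    if random.random() < epsilon:
        choice = random.randrange(2)
    else:
        choice = max(range(2), key=lambda a: values[a])

    reward = 1.0 if random.random() < true_payout_probs[choice] else 0.0

    # Classic reinforcement-learning update: nudge the value estimate
    # toward the observed outcome in proportion to the prediction error.
    values[choice] += alpha * (reward - values[choice])

print(values)  # the estimates converge toward the true payout probabilities
```

Over many trials, the learner's value estimates track the true payouts, and its choices shift toward the better machine, which is exactly the behavioral adjustment described above.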
However, the reinforcement-learning framework alone does not adequately describe decision-making in larger and more complicated tasks. In 2015, DeepMind, an artificial intelligence company owned by Google, developed a complex artificial intelligence algorithm called the Deep Q-Network (DQN) that could learn to play dozens of Atari video games at human or superhuman levels.
The DQN combines the classic reinforcement learning framework with another recent advancement called a convolutional neural network. The convolutional neural network acts as a perceptual system that learns to detect visual features in the Atari pixel space (the game screen) that are predictive of reward (scoring points). This enables the DQN to learn which actions to take in a given situation just by looking at the pixels in the game. Importantly, the rules of the game are not programmed into the DQN agent; it must learn for itself how the game is played through trial and error, as good decisions are positively reinforced when the agent scores points (along with the actions leading up to the score).
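As a point of reference, the sketch below reproduces the layer structure of DeepMind's published 2015 DQN in PyTorch. It shows how the convolutional "perceptual" stage feeds a value-output stage, but it is an illustration of the architecture, not the code used in this study.

```python
import torch
import torch.nn as nn

# Sketch of the DQN architecture from DeepMind's 2015 paper, written in
# PyTorch for illustration. Layer sizes follow the published network; this
# is not the code used in the Caltech study.

class DQN(nn.Module):
    def __init__(self, n_actions: int):
        super().__init__()
        # Convolutional layers: the "perceptual system" that extracts
        # reward-predictive visual features from raw Atari pixels
        # (a stack of 4 grayscale 84x84 frames).
        self.conv = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        # Fully connected layers: map the compact learned representation
        # to an estimated value for each possible joystick action.
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, n_actions),
        )

    def forward(self, pixels: torch.Tensor) -> torch.Tensor:
        return self.head(self.conv(pixels))

# The agent acts by choosing the action with the highest predicted value.
net = DQN(n_actions=6)  # e.g., the 6 joystick actions in some Atari games
frame_stack = torch.zeros(1, 4, 84, 84)
action = net(frame_stack).argmax(dim=1)
```

During training, the action values are updated from the points the agent scores, so decisions that lead to reward are reinforced, just as in the simpler bandit example above.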
In this study, the DQN was trained on the Atari video games Pong, Space Invaders, and Enduro (a racing game), and the activity of its artificial neurons was then used to predict the behavior and brain activity, measured with functional magnetic resonance imaging (fMRI), of human participants as they played the same games. In particular, the researchers found that brain activity in two brain regions involved in perception and vision, the dorsal visual pathway and the posterior parietal cortex, could be modeled using DQN features.
In all of the games, the DQN must learn how to pick out the relevant features from a large volume of visual input, just as a human would. It must compress this relevant information into what is called a state-space, a compact representation of what is going on in the current state of the game. For example, in Pong, the researchers found that the DQN's state-space codes for the spatial positions of the ball and paddles while ignoring features like the color of the background and the game score at the top of the screen. This is very similar to how the human brain represents the game in the dorsal visual pathway, the part of the brain that recognizes where objects are in space in order to guide actions related to those objects.
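To make that compression concrete, here is a hypothetical compact state for Pong of the kind described. The field names and values are ours, chosen for illustration; they are not labels read out of the network.

```python
from dataclasses import dataclass

# Illustrative (hypothetical) compact state-space for Pong, of the kind the
# DQN was found to encode: positions of the ball and paddles, with the
# background color and on-screen score ignored.

@dataclass
class PongState:
    ball_x: float
    ball_y: float
    player_paddle_y: float
    opponent_paddle_y: float

# A full 210x160 RGB Pong frame contains roughly 100,000 pixel values; the
# task-relevant state above compresses that to four numbers that are enough
# to decide which way to move the paddle.
state = PongState(ball_x=0.42, ball_y=0.77,
                  player_paddle_y=0.50, opponent_paddle_y=0.61)
```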
In the game Enduro, the player drives a car as fast as possible while trying to avoid other cars. During the drive, the sky changes color from day to night. It is easy for a person playing the game to ignore these changes, as they are irrelevant to the actual game, in the same way that we learn to ignore clouds in the sky when driving a car. But an AI network must learn for itself that the changing color of the sky has no impact on driving.
The researchers found that the DQN units that learn to ignore this irrelevant visual information better explained the patterns of brain activity seen in the game-playing volunteers' posterior parietal cortex, the part of the brain that connects perception to motor movement. Similar results were also found in Space Invaders.
While the researchers have found similarities between the DQN and the human brain, the two are not identical.
"It takes days of nonstop playing for DQN to learn to play these games, but humans can learn in minutes," says Cross. "Why is it easy for human brains to figure out what the relevant features are when driving a car but hard for an artificial intelligence? Answering this question is a grand challenge for AI researchers. It is hard for AI because as the background colors change, its visual input dramatically changes as it just 'sees' numbers in the pixel space. It takes a lot of training for DQN to learn that two situations that are dramatically different in the pixel space are actually conceptually similar in terms of what you should do."
On the other hand, Cross adds, the human brain is shaped throughout its development to learn to pick out the most important information for common daily tasks. "The dorsal visual pathway in particular, which is our main region of interest, is able to rapidly localize objects independently of their colors," he says. "Additionally, the brain somehow encodes common-sense notions of physics and how objects typically move, which allows humans to perform a wide variety of tasks well with little training. All of this has to be learned from scratch by DQN."
In recent years, other research has discovered similarities between the brain and deep neural networks, but most of these studies have focused on object recognition rather than active decision-making. This study introduces a new framework for studying behavior and brain activity in complex decision-making tasks that may be more representative of daily life than the tasks previously used in the field.
The paper is titled "Using deep reinforcement learning to reveal how the brain encodes abstract state-space representations in high-dimensional environments." In addition to Cross and O'Doherty, co-authors are Jeff Cockburn, postdoctoral scholar research associate in neuroscience, and Yisong Yue, professor of computing and mathematical sciences. Funding was provided by the National Institute on Drug Abuse, the National Institute of Mental Health, and the Caltech Conte Center for the Neurobiology of Social Decision-Making.