Summary: Researchers have identified two distinct groups of neurons that help the brain weigh risk and reward during decision-making. These neurons, located in the ventral striatum, separately track better-than-expected and worse-than-expected outcomes. In experiments in mice, silencing these neurons shifted reward expectations and altered decision-making behavior.
The study suggests that the brain tracks the full range of possible rewards rather than just their average, consistent with machine learning models of decision-making. If confirmed in humans, the finding could help explain the difficulties in assessing risk seen in conditions such as depression and addiction. Future research will explore how uncertainty affects this brain circuit.
Key facts:
- Two neural groups: One group tracks outcomes that are better than expected, while the other tracks outcomes that are worse than expected.
- Decision mechanism: The brain represents not just the average but the full spectrum of possible rewards.
- Clinical relevance: The findings may help explain impaired risk assessment in conditions such as depression and addiction.
Source: Harvard
Our brains make thousands of decisions each day, big and small. Every one of these decisions, from minor ones like choosing a restaurant to major ones like pursuing a different career or moving to a new city, carries the chance of a better or worse outcome.
How does the brain weigh risk and reward when making these calls? The question has long puzzled scientists, but new research by researchers at Harvard Medical School and Harvard University offers intriguing clues.
The study, published on February 19 and supported in part by federal funding, incorporated concepts from machine learning into mouse experiments to study the brain circuits that support reward-based decisions.
The scientists discovered two groups of brain cells in mice: one helps the animals learn from better-than-average outcomes, while the other is linked to worse-than-average outcomes. In the experiments, these cells allowed the brain to gauge the full range of rewards that could be associated with a choice.
“Our results suggest that mice, and presumably other mammals, represent more granular details about risk and reward than previously thought,” said Jan Drugowitsch of Harvard Medical School.
If confirmed in humans, the findings could provide a framework for understanding how the human brain makes reward-based decisions, and what happens to the ability to judge risk and reward when reward circuits malfunction.
Machine learning illuminates reward-based decisions
Neuroscientists have long been interested in how the brain uses past experiences to make new decisions. However, according to Drugowitsch, many traditional theories of such decision-making fail to capture the complexity and nuance of real behavior.
Drugowitsch uses the example of choosing a restaurant. If you are in the mood to play it safe, you might pick a restaurant whose menu you know is consistently good over one you know serves a mix of exceptional and subpar dishes.
In this example, the range of what the two restaurants offer is quite different, yet because their averages are comparable, existing neuroscience theory treats them as equivalent and predicts you would be equally likely to choose either one.
“We know this is not how humans and animals behave. You can decide to play it safe or to seek risk,” Drugowitsch said. “We have a sense of more than just the average expected reward associated with a choice.”
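To make this concrete, here is a minimal sketch with invented numbers; the restaurants and reward values are illustrative assumptions, not data from the study:

```python
import numpy as np

# Two hypothetical restaurants with made-up "meal quality" scores.
# "Safe" is reliably decent; "Risky" mixes exceptional and subpar dishes.
safe = np.array([7.0, 7.0, 7.0, 7.0])
risky = np.array([10.0, 10.0, 4.0, 4.0])

# A mean-only account sees them as identical...
print(safe.mean(), risky.mean())  # 7.0 7.0

# ...even though the spread of outcomes differs sharply.
print(safe.std(), risky.std())    # 0.0 3.0
```

A decision-maker who tracks only the averages has no basis for preferring one restaurant over the other; the difference lives entirely in the distribution.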
In recent years, machine learning researchers have developed decision-making theories that better capture the full range of potential rewards associated with a choice.
They incorporated this theory into new machine learning algorithms that outperformed alternative algorithms at Atari video games and at other tasks in which each decision has multiple possible outcomes.
“They basically asked what happens if, instead of learning only the average reward for a particular action, the algorithm learns the whole distribution, and discovered that performance improved significantly,” Drugowitsch said.
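As a rough illustration of the difference, here is a quantile-style sketch of distributional learning in Python. The two-outcome reward, the learning rate, and the quantile levels are assumptions chosen for illustration, not details of the algorithms used in the Atari work:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical action that pays 10 half the time and 4 otherwise.
def sample_reward():
    return 10.0 if rng.random() < 0.5 else 4.0

alpha = 0.05
mean_est = 0.0                           # classical learner: one running mean

taus = np.array([0.1, 0.3, 0.7, 0.9])    # distributional learner: quantiles
q_est = np.zeros_like(taus)

for _ in range(20000):
    r = sample_reward()
    # Classical update: nudge the single estimate toward the reward.
    mean_est += alpha * (r - mean_est)
    # Quantile update: step up with weight tau on positive surprises,
    # down with weight (1 - tau) on negative ones.
    q_est += alpha * np.where(r > q_est, taus, taus - 1.0)

print(round(mean_est, 2))   # ~7.0: the average hides the two outcomes
print(np.round(q_est, 2))   # low quantiles settle near 4, high ones near 10
```

The mean estimate converges to a value the action never actually delivers, while the set of quantile estimates recovers the spread between the good and bad outcomes.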
In a 2020 Nature paper, Naoshige Uchida, professor of molecular and cellular biology at Harvard University, and colleagues reanalyzed existing data to investigate whether this machine learning theory applies to neuroscience.
The analysis showed that in mice, the activity of dopamine, a neurotransmitter that plays a role in reward seeking, pleasure, and motivation, corresponded to the reward learning signals predicted by the algorithm.
In other words, Drugowitsch said, the study suggested that the new algorithm was better at explaining dopamine activity.
How the mouse brain represents a range of rewards
In the new study, Drugowitsch worked with co-author Uchida to take this a step further. Together, they designed experiments in mice to see how this process unfolds in a brain region called the ventral striatum, which stores information about the rewards that may be relevant to a decision.
“Dopamine activity only provides the learning signal for predicted rewards, but we wanted to find representations of these learned rewards directly in the brain,” Drugowitsch said.
The researchers trained mice to associate different odors with rewards of different sizes, essentially teaching the mice the range of possible outcomes of a choice. They then presented the odors to the mice and observed their licking behavior (more licking in anticipation of a better reward) while recording neural activity in the ventral striatum.
The team identified two distinct groups of neurons in the brain: one helps mice learn about outcomes that are better than expected, while the other is linked to outcomes that are worse than expected.
“You can think of these as an optimist and a pessimist in the brain, both giving advice on what to do next,” Drugowitsch explained.
When the researchers silenced the “optimistic” neurons, the mice behaved as if they expected less attractive rewards. Conversely, when the researchers silenced the “pessimistic” neurons, the mice behaved as if they expected rewards of higher value.
“These two groups of brain cells work together to form a representation of the full distribution of potential rewards for a decision,” Drugowitsch said.
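One way to picture this division of labor is with asymmetric learning rates, in the spirit of expectile-style distributional reinforcement learning. The sketch below is a toy illustration under that assumption, not the authors' actual circuit model; the reward values and learning rates are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical odor cue that pays a big reward (10) half the time
# and a small one (4) the rest of the time.
def sample_reward():
    return 10.0 if rng.random() < 0.5 else 4.0

# Two toy cell groups that see the same prediction errors but
# weight them asymmetrically.
alpha_big, alpha_small = 0.09, 0.01
v_optimist = 0.0    # weights good surprises heavily
v_pessimist = 0.0   # weights bad surprises heavily

for _ in range(50000):
    r = sample_reward()
    err_o = r - v_optimist
    v_optimist += (alpha_big if err_o > 0 else alpha_small) * err_o
    err_p = r - v_pessimist
    v_pessimist += (alpha_small if err_p > 0 else alpha_big) * err_p

# The optimist settles near the upper tail of the reward distribution,
# the pessimist near the lower tail (roughly 9.4 and 4.6 here).
print(round(v_optimist, 1), round(v_pessimist, 1))
```

In this picture, silencing one population leaves behavior guided by the remaining estimate alone, which is consistent with the behavioral shifts the researchers observed after silencing each group.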
The researchers see many future directions for the work, including probing what each option in a decision initially represents in the brain, and how the brain applies this machinery to more general reasoning about the world when decisions involve greater uncertainty.
Drugowitsch noted that further research is needed to confirm the results in humans and to adapt the findings to the complexities of human decision-making. However, given the similarities between mouse and human brains, he believes the study could already shed light on how humans assess risk in decisions, and on why that assessment suffers in people with conditions such as depression and addiction.
Author, Funding, Disclosure
Additional authors for this paper include Adam Lowet, Qiao Zheng, Melissa Meng and Sara Matias.
Funding: This study was funded by the National Institutes of Health (R01NS116753; F31NS124095), the Human Frontier Science Program (LT000801/2018), the Harvard Brain Science Initiative, and the Brain & Behavior Research Foundation.
About this decision-making and neuroscience research news
Author: Dennis Nealon
Source: Harvard
Contact: Dennis Nealon – Harvard
Image: Image credited to Neuroscience News
Original Research: Closed Access.
“An opponent striatal circuit for distributional reinforcement learning” by Jan Drugowitsch et al. Nature
Abstract
An opponent striatal circuit for distributional reinforcement learning.
Machine learning research has achieved significant performance gains on a wide range of tasks by expanding the learning target from the mean reward to the entire probability distribution of rewards, an approach known as distributional reinforcement learning (RL).
The mesolimbic dopamine system is thought to support RL in the mammalian brain by updating mean value representations in the striatum, but little is known about whether, where, and how neurons in this circuit encode information about higher-order moments of reward distributions.
Here, to fill this gap, we used high-density probes (Neuropixels) to record striatal activity in mice performing a classical conditioning task in which the mean reward, reward variance, and stimulus identity were independently manipulated.
In contrast to traditional RL accounts, we found robust evidence for abstract encoding of variance in the striatum. Chronic ablation of dopamine inputs disrupted these distributional representations in the striatum without interfering with mean value coding.
Two-photon calcium imaging and optogenetics revealed that the two major classes of striatal medium spiny neurons, D1 and D2, contribute to this code by preferentially encoding the right and left tails of the reward distribution, respectively.
We synthesize these findings into a new model of the striatum and mesolimbic dopamine that harnesses the opponency between D1 and D2 medium spiny neurons to reap the computational benefits of distributional RL.