ABSTRACT: Researchers have developed a brain-computer interface that can synthesize natural-sounding speech from brain activity in real time, restoring a voice to people with severe paralysis. The system decodes signals from the motor cortex and uses AI to transform them into audible speech with minimal delay, under one second.
Unlike previous systems, this method preserves fluency and allows continuous speech, even generating a personalized voice. The advance brings scientists closer to giving people who have lost the ability to speak a way to communicate in real time using only their brain activity.
Key facts:
Near-real-time speech: The new BCI technology streams intelligible speech within 1 second.
Personalized voice: The system uses recordings made before the injury to synthesize the user's own voice.
Device flexibility: It works across multiple brain-sensing technologies, including non-invasive options.
Source: UC Berkeley
Marking an advance in the field of brain-computer interfaces (BCIs), a team of researchers from UC Berkeley and UC San Francisco has unlocked a way to restore naturalistic speech for people with severe paralysis.
This work addresses the long-standing challenge of latency in speech neuroprostheses, the time between when a subject attempts to speak and when sound is produced. Using recent advances in artificial intelligence-based modeling, the researchers developed a streaming method that synthesizes brain signals into audible speech in near real time.

As reported in Nature Neuroscience, this technology represents a critical step toward enabling communication for people who have lost the ability to speak. The study was supported by the National Institute on Deafness and Other Communication Disorders (NIDCD) of the National Institutes of Health.
“Our streaming approach brings the same rapid speech-decoding capacity of devices like Alexa and Siri to neuroprostheses,” said Gopala Anumanchipalli, Robert E. and Beverly A. Brooks Assistant Professor of Electrical Engineering and Computer Sciences at UC Berkeley and co-principal investigator of the study.
“Using a similar type of algorithm, we found that we could decode neural data and, for the first time, enable near-synchronous voice streaming. The result is more naturalistic, fluent speech synthesis.”
“This new technology has tremendous potential for improving quality of life for people living with severe paralysis affecting speech,” said neurosurgeon Edward Chang, principal investigator of the study.
Chang leads the clinical trial at UCSF that aims to develop speech neuroprosthesis technology using high-density electrode arrays that record neural activity directly from the brain surface.
“It is exciting that the latest AI advances are greatly accelerating BCIs for practical real-world use in the near future.”
The researchers also showed that their approach can work well with a variety of other brain-sensing interfaces, including microelectrode arrays (MEAs), in which electrodes penetrate the brain's surface, and non-invasive recordings (sEMG) that use sensors on the face to measure muscle activity.
“By demonstrating accurate brain-to-voice synthesis on other silent-speech datasets, we showed that this technique is not limited to one specific type of device,” said Kaylo Littlejohn, a Ph.D. student in UC Berkeley's Department of Electrical Engineering and Computer Sciences and co-lead author of the study.
“The same algorithm can be used across different modalities, provided a good signal is there.”
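To make that modality-agnostic point concrete, here is a minimal, purely illustrative Python sketch (not the team's code): each recording technology is reduced to a common time-by-feature representation, after which a single shared decoder can be reused. The feature choices, dimensions, and function names below are assumptions for illustration only.

```python
import numpy as np

def featurize_ecog(x: np.ndarray) -> np.ndarray:            # (time, channels) surface recordings
    return np.log1p(np.abs(x))                               # stand-in for high-gamma power

def featurize_mea_spikes(counts: np.ndarray) -> np.ndarray:  # (time, units) spike counts
    return np.sqrt(counts)                                   # common variance-stabilizing step

def featurize_semg(x: np.ndarray) -> np.ndarray:             # (time, sensors) facial EMG
    return np.abs(x)                                         # stand-in for rectified EMG envelope

def decode(features: np.ndarray) -> np.ndarray:
    """One shared decoder applied to any modality's features (placeholder linear map)."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.01, size=(features.shape[1], 80))
    return features @ w                                      # (time, acoustic features)

for featurize, shape in [(featurize_ecog, (100, 253)),
                         (featurize_mea_spikes, (100, 96)),
                         (featurize_semg, (100, 16))]:
    data = np.abs(np.random.default_rng(1).normal(size=shape))
    print(decode(featurize(data)).shape)                     # same decoder, three modalities
```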
Decoding neural data into speech
According to study co-lead author Cheol Jun Cho, a UC Berkeley Ph.D. student in electrical engineering and computer sciences, the neuroprosthesis works by sampling neural data from the motor cortex, the part of the brain that controls speech production, then using AI to decode brain function into speech.
“We are essentially intercepting signals where the thought is translated into articulation and in the middle of that motor control,” he said.
“So what we're decoding is after a thought has happened, after we've decided what to say, after we've decided what words to use and how to move our vocal-tract muscles.”
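As a rough, hypothetical illustration of that pipeline (not the researchers' implementation), the sketch below maps one window of motor-cortex activity to acoustic features and then to an audio chunk. All shapes, feature choices, and the simple linear "model" are assumptions.

```python
import numpy as np

N_CHANNELS = 253         # assumed number of cortical recording channels
N_ACOUSTIC = 80          # assumed acoustic feature dimension (e.g., mel bins)

def extract_neural_features(raw_window: np.ndarray) -> np.ndarray:
    """Stand-in for feature extraction (e.g., per-channel high-gamma power)."""
    return np.log1p(np.abs(raw_window).mean(axis=0))          # (N_CHANNELS,)

def decode_to_acoustics(feats: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Stand-in for the learned neural-to-speech model (here just a linear map)."""
    return weights @ feats                                     # (N_ACOUSTIC,)

def vocode(acoustic_frame: np.ndarray, n_samples: int = 1280) -> np.ndarray:
    """Stand-in for a vocoder that renders acoustic features as audio samples."""
    rng = np.random.default_rng(0)
    return rng.normal(scale=float(np.abs(acoustic_frame).mean()), size=n_samples)

# One decoding step over ~80 ms of simulated neural data.
rng = np.random.default_rng(1)
raw = rng.normal(size=(160, N_CHANNELS))                       # fake raw window
W = rng.normal(scale=0.01, size=(N_ACOUSTIC, N_CHANNELS))      # fake trained weights
audio = vocode(decode_to_acoustics(extract_neural_features(raw), W))
print(audio.shape)                                             # (1280,)
```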
To collect the data needed to train their algorithm, the researchers first had Ann, their subject, look at a prompt on the screen, such as the phrase “Hey, how are you?”, and then silently attempt to speak that sentence.
“This gave us a mapping between the windows of neural activity that she generates and the target sentence that she is trying to say, without her needing to vocalize at any point,” said Littlejohn.
Because Ann has no residual vocalization, the researchers had no target audio, or output, to which they could map the neural data, the input. They solved this challenge by using AI to fill in the missing details.
“We used a pretrained text-to-speech model to generate audio and simulate a target,” Cho said. “And we also used Ann's pre-injury voice, so when we decode the output, it sounds more like her.”
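A hedged sketch of that idea, with placeholder functions rather than any real TTS API: each prompted sentence is passed through a pretrained text-to-speech model (in the study, one personalized with Ann's pre-injury voice) to produce reference audio, which then serves as the training target paired with the neural recording from the silent attempt.

```python
import numpy as np

def tts(sentence: str, sample_rate: int = 16_000) -> np.ndarray:
    """Placeholder for a pretrained text-to-speech model (in the study, one that
    can be conditioned on pre-injury recordings so the output resembles the
    user's own voice). Returns a silent stand-in waveform here."""
    duration_s = 0.4 * len(sentence.split())          # crude length heuristic
    return np.zeros(int(duration_s * sample_rate))

def build_training_pair(prompt: str, neural_trial: np.ndarray):
    """Pair one silent speech attempt (a neural recording) with synthesized target audio."""
    return neural_trial, tts(prompt)

neural_trial = np.random.default_rng(2).normal(size=(500, 253))  # assumed trial shape
x, y = build_training_pair("Hey, how are you?", neural_trial)
print(x.shape, y.shape)   # neural input and its simulated acoustic target
```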
Streaming speech in real time
In their previous BCI study, the researchers' decoding had a long latency, with roughly an 8-second delay to produce a single sentence. With the new streaming approach, audible output can be generated in near real time as the subject attempts to speak.
To measure latency, the researchers used speech-detection methods, which allowed them to identify the brain signals that indicate the start of a speech attempt.
“We can see, relative to that intent signal, that within 1 second we are getting the first sound out,” said Anumanchipalli. “And the device can continuously decode speech, so Ann can keep speaking without interruption.”
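The streaming and latency-measurement idea can be sketched as follows; the detector, decoder, window size, and thresholds are all placeholders, and real latencies depend on the actual models rather than on this toy simulation.

```python
import numpy as np

WINDOW_S = 0.08                      # assumed 80 ms decoding increment

def detect_speech_onset(window: np.ndarray, threshold: float = 1.2) -> bool:
    """Placeholder detector for the neural signature of an attempted speech onset."""
    return float(np.abs(window).mean()) > threshold

def decode_window(window: np.ndarray) -> np.ndarray:
    """Placeholder decoder returning one chunk of synthesized audio samples."""
    return np.zeros(int(WINDOW_S * 16_000))

def stream(windows):
    """Decode windows as they arrive; report delay from detected intent to first audio."""
    onset_t, t = None, 0.0
    for window in windows:
        if onset_t is None and detect_speech_onset(window):
            onset_t = t                                   # intent-to-speak detected
        if onset_t is not None:
            if t == onset_t:                              # first chunk after onset
                print(f"first audio ~{t + WINDOW_S - onset_t:.2f} s after detected intent")
            yield decode_window(window)                   # audio keeps streaming
        t += WINDOW_S

rng = np.random.default_rng(0)
rest = rng.normal(scale=0.5, size=(5, 160, 253))          # simulated pre-speech windows
attempt = rng.normal(scale=2.0, size=(20, 160, 253))      # simulated speech-attempt windows
chunks = list(stream(np.concatenate([rest, attempt])))
print(len(chunks), "audio chunks streamed")
```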
This increased speed did not come at the cost of accuracy. The faster interface delivered the same high level of decoding accuracy as their previous, non-streaming approach.
“That's promising to see,” said Littlejohn. “Previously, it was not known whether intelligible speech could be streamed from the brain in real time.”
Anumanchipalli added that researchers do not always know whether large-scale AI systems are truly learning and generalizing, or simply pattern-matching and repeating parts of their training data. So the researchers also tested the real-time model's ability to synthesize words that were not part of the training dataset's vocabulary, in this case 26 rare words taken from the NATO phonetic alphabet, such as “Alpha,” “Bravo,” “Charlie” and so on.
“We wanted to see if we could generalize to the unseen words and really decode Ann's patterns of speaking,” he said.
“We found that our model does this well, which shows that it is indeed learning the building blocks of sound or voice.”
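A simplified, hypothetical version of that check: hold NATO-alphabet words out of the training vocabulary, run them through the decoding pipeline, and score the recognized outputs. The transcription function below is a placeholder, and exact word match is only a stand-in for the intelligibility metrics a real evaluation would use.

```python
NATO_WORDS = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]

def transcribe_decoded_audio(word: str) -> str:
    """Placeholder: pretend to run the full neural-to-speech pipeline, then a
    speech recognizer, on the decoded audio for one attempted word."""
    return word  # stand-in for the recognized transcript

def out_of_vocab_accuracy(test_words, training_vocab):
    """Score only words that were never seen during training."""
    held_out = [w for w in test_words if w not in training_vocab]
    correct = sum(transcribe_decoded_audio(w) == w for w in held_out)
    return correct / len(held_out) if held_out else float("nan")

print(out_of_vocab_accuracy(NATO_WORDS, training_vocab={"hey", "how", "are", "you"}))
```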
Ann, who also participated in the 2023 study, shared with the researchers how her experience with the new streaming synthesis approach compared with the earlier study's text-to-speech decoding method.
“She conveyed that streaming synthesis was a more volitionally controlled modality,” said Anumanchipalli. “Hearing her own voice in near-real time increased her sense of embodiment.”
Future directions
This latest work brings researchers a step closer to achieving naturalistic speech with BCI devices while laying the groundwork for future advances.
“This proof-of-concept framework is quite a breakthrough,” Cho said. “We are optimistic that we can now make advances at every level. On the engineering side, for example, we will continue to push the algorithm to see how we can generate speech better and faster.”
The researchers also remain focused on building expressiveness into the output voice to reflect the changes in tone, pitch or loudness that occur during speech, such as when someone is excited.
“That's ongoing work, to try to see how well we can decode these paralinguistic features from brain activity,” said Littlejohn. “This is a long-standing problem even in classical audio-synthesis fields, and it would close the gap to full and complete naturalism.”
FUNDING: In addition to the NIDCD, support came from the Japan Science and Technology Agency's Moonshot Research and Development Program, the Joan and Sandy Weill Foundation, Susan and Bill Oberndorf, Ron Conway, Graham and Christina Spence, and the National Science Foundation.
About this AI and BCI research news
Author: Marni Ellery
Source: UC Berkeley
Contact: Marni Ellery – UC Berkeley
Image: The image is credited to Neuroscience News
Original research: closed access.
“A brain-to-voice neuroprosthesis to restore naturalistic communication” by Gopala Anumanchipalli et al. Nature Neuroscience
Abstract
A brain-to-voice neuroprosthesis to restore naturalistic communication
Natural spoken communication happens instantaneously. Speech delays of more than a few seconds can disrupt the natural flow of conversation, making it difficult for people with paralysis to participate in meaningful dialogue and potentially leading to feelings of isolation and frustration.
Here we used high-density surface recordings of the speech sensorimotor cortex in a clinical trial participant with severe paralysis and anarthria to drive a continuously streaming naturalistic speech synthesizer.
We designed and used deep-learning recurrent neural network transducer models to achieve online, large-vocabulary, intelligible fluent speech synthesis personalized to the participant's pre-injury voice, with neural decoding in 80-ms increments.
Offline, the models demonstrated implicit speech-detection capabilities and could continuously decode speech indefinitely, enabling uninterrupted use of the decoder and further increasing speed.
Our framework also generalized successfully to other silent-speech interfaces, including single-unit recordings and electromyography.
Our findings introduce a speech-neuroprosthetic paradigm to restore naturalistic spoken communication to people with paralysis.