In the first experiment of its kind, scientists have been able to translate brain signals directly into intelligible speech. It may sound like wild science fiction at first, but this feat could actually help some people with speech issues.
And yes, we could also get some futuristic computer interfaces out of this.
Key to the system is an artificial intelligence algorithm that matches the patterns of electrical activity in a person's brain to the sounds they're hearing, and then turns those patterns into speech that actually makes sense to a listener.
We know from previous research that when we speak (or even just imagine speaking), we get distinct patterns of activity in the brain's neural networks. In this case, the system decodes brain responses to heard speech rather than actual thoughts, but with enough development it has the potential to do that too.
"Our voices help connect us to our friends, family and the world around us, which is why losing the power of one's voice due to injury or disease is so devastating," says one of the team, Nima Mesgarani from Columbia University in New York.
"With today's study, we have a potential way to restore that power. We've shown that, with the right technology, these people's thoughts could be decoded and understood by any listener."
The algorithm used is called a vocoder, the same type of algorithm that can synthesise speech after being trained on humans talking. When you get a response back from Siri or Amazon Alexa, it's a vocoder that's being deployed.
In other words, Amazon and Apple don't have to program every single word into their devices – the vocoder creates a realistic-sounding voice for whatever text needs to be said.
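To make that concrete, here's a toy sketch of the core vocoder idea – rebuilding a voice-like sound from a handful of parameters (a pitch, a rough spectral shape) rather than playing back stored recordings. Every value in it is illustrative, and it has nothing to do with the commercial systems or the study's own synthesiser.

```python
# Toy parametric synthesis: the core idea behind a vocoder is that a short list
# of parameters (pitch, loudness, spectral shape) is enough to rebuild a
# voice-like sound. All values below are illustrative.
import numpy as np
from scipy.signal import lfilter

SR = 16000                          # sample rate in Hz
DURATION = 0.5                      # seconds of audio to synthesise
PITCH = 120.0                       # fundamental frequency of the "voice" in Hz
FORMANTS = [(700, 80), (1200, 90)]  # (centre Hz, bandwidth Hz) of an /a/-like vowel

t = np.arange(int(SR * DURATION)) / SR

# Source: a buzzy pulse train at the chosen pitch, a crude stand-in for the vocal folds.
source = np.sign(np.sin(2 * np.pi * PITCH * t))

# Filter: impose a spectral envelope by running the source through simple resonators.
signal = np.zeros_like(source)
for centre, bandwidth in FORMANTS:
    r = np.exp(-np.pi * bandwidth / SR)   # pole radius set by the bandwidth
    theta = 2 * np.pi * centre / SR       # pole angle set by the centre frequency
    signal += lfilter([1.0], [1.0, -2 * r * np.cos(theta), r ** 2], source)

signal /= np.max(np.abs(signal))          # normalise so it could be written out as a WAV file
```

Change the pitch and formant parameters and you get a different voice – which is exactly why a vocoder only needs a compact set of numbers, not a library of recordings.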
Here, the vocoder wasn't trained on human speech but on neural activity in the auditory cortex, measured in patients undergoing brain surgery as they listened to sentences being spoken aloud.
With that bank of data to draw on, brain signals recorded as the patients listened to the digits 0 to 9 being read out were run through the vocoder and cleaned up with the help of more AI analysis. The resulting audio closely matched the sounds the patients had heard – even if the final voice still sounded quite robotic.
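The paper's actual models aren't spelled out here, but the recipe the article describes – learn a mapping from recorded auditory-cortex activity to the parameters a vocoder needs, then synthesise audio from the predictions – can be sketched roughly as below. The array shapes, feature counts and choice of regressor are placeholders, not the researchers' setup.

```python
# Minimal sketch of the decoding recipe described above, not the study's actual code.
# Shapes, feature counts and the regressor are illustrative placeholders.
import numpy as np
from sklearn.neural_network import MLPRegressor

N_TRIALS = 200         # hypothetical number of recorded listening segments
N_ELECTRODES = 64      # hypothetical number of auditory-cortex recording channels
N_VOCODER_PARAMS = 32  # hypothetical size of the parameter vector a vocoder consumes

rng = np.random.default_rng(0)

# Stand-in training data: neural activity recorded while known sentences were heard,
# paired with the vocoder parameters of the audio that was played.
neural_activity = rng.standard_normal((N_TRIALS, N_ELECTRODES))
vocoder_params = rng.standard_normal((N_TRIALS, N_VOCODER_PARAMS))

# A small neural network learns the mapping from brain activity to vocoder parameters.
decoder = MLPRegressor(hidden_layer_sizes=(256, 256), max_iter=500)
decoder.fit(neural_activity, vocoder_params)

# At test time, brain responses to new audio (here, the spoken digits) are decoded
# into parameters, which a vocoder would then turn back into an audible waveform.
new_responses = rng.standard_normal((10, N_ELECTRODES))
predicted_params = decoder.predict(new_responses)
print(predicted_params.shape)  # (10, 32): one parameter vector per decoded trial
```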
The technique proved far more effective than previous efforts, which used simpler computer models to reconstruct spectrogram images – visual representations of sound frequencies.
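For contrast, here's what that simpler route looks like in miniature: keep only a magnitude spectrogram, then estimate the missing phase (for example with the Griffin-Lim algorithm) to get audio back. The clip and settings below are arbitrary stand-ins; the point is only that the round trip through a spectrogram throws information away, which is one reason such reconstructions sound less natural.

```python
# Round trip through a magnitude spectrogram, the representation used by the
# simpler baselines mentioned above. Dropping phase and estimating it back with
# Griffin-Lim degrades the audio compared with a full parametric vocoder.
import numpy as np
import librosa

SR = 16000
# Stand-in audio: one second of a 440 Hz tone (any speech clip would do here).
t = np.arange(SR) / SR
audio = np.sin(2 * np.pi * 440.0 * t)

# Forward: complex STFT -> keep only the magnitude, as a spectrogram does.
magnitude = np.abs(librosa.stft(audio, n_fft=512, hop_length=128))

# Inverse: Griffin-Lim iteratively guesses the discarded phase to recover a waveform.
reconstructed = librosa.griffinlim(magnitude, n_iter=32, hop_length=128)
```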
"We found that people could understand and repeat the sounds about 75 percent of the time, which is well above and beyond any previous attempts," says Mesgarani.
"The sensitive vocoder and powerful neural networks represented the sounds the patients had originally listened to with surprising accuracy."
There's a lot of work still to do, but the potential is huge. Again, it's worth emphasising that the system doesn't turn actual mental thoughts into spoken words, but it might be able to do that eventually – that's the next challenge the researchers want to tackle.
Further down the line you might even be able to think your emails onto the screen or turn on your smart lights just by issuing a mental command.
That will take time, though, not least because all our brains work slightly differently – a large amount of training data from each person would be needed to accurately interpret their thoughts.
In the not-too-distant future, we're potentially talking about giving a voice to people who don't currently have one, whether because of locked-in syndrome, a stroke, or (as in the case of the late Stephen Hawking) amyotrophic lateral sclerosis (ALS).
"In this scenario, if the wearer thinks 'I need a glass of water', our system could take the brain signals generated by that thought, and turn them into synthesised, verbal speech," says Mesgarani.
"This would be a game changer. It would give anyone who has lost their ability to speak, whether through injury or disease, the renewed chance to connect to the world around them."
The research has been published in Scientific Reports.