US researchers have used non-invasive technology to identify, from the brain activity of test subjects, what text they were listening to or silently imagining. The system also reproduced, in text form, the content of a silent film the subjects were watching.

The feat was achieved with the help of functional magnetic resonance imaging (fMRI), as the research team led by Alexander G. Huth at the University of Texas at Austin reports in the current issue of Nature Neuroscience. “This method is not exactly the first choice when it comes to measuring brain activity,” Huth explained at a press conference. Although it offers high spatial resolution, it is very sluggish, unlike electroencephalography (EEG), for example, which resolves activity on a millisecond timescale.

fMRI does not measure the activity of neurons directly, but rather the blood flow they influence: the firing of neurons consumes energy, which affects the oxygen content of the blood. A neuronal impulse causes this signal, known as the “blood-oxygen-level-dependent” (BOLD) response, to rise and fall again over about ten seconds. “Observing a brain in this way is a bit like trying to read the activities in a city from the distribution of light alone,” says Huth. Twenty years ago, the idea of decoding speech from these signals would have struck any neuroscientist as downright ridiculous.
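To get a feel for this timescale, here is a minimal Python sketch, assuming the standard double-gamma model of the hemodynamic response (an illustration only, not the study’s analysis code):

```python
import numpy as np
from scipy.stats import gamma

dt = 0.1
t = np.arange(0, 25, dt)                        # seconds
# Canonical double-gamma HRF (peak around 5 s, undershoot around 15 s)
hrf = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6

neural = np.zeros_like(t)
neural[int(1 / dt)] = 1.0                       # one brief neural impulse at t = 1 s

bold = np.convolve(neural, hrf)[: len(t)] * dt  # the resulting BOLD signal

# The response peaks roughly 5 s after the impulse and takes ~20 s to settle
print(f"BOLD peaks at t = {t[np.argmax(bold)]:.1f} s for an impulse at t = 1.0 s")
```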

In ten seconds, an English speaker typically utters more than twenty words, so individual words cannot be decoded from an fMRI image of the brain. Other systems, which target the motor activity of speaking or handwriting and reconstruct the movements of the tongue and lips from brain signals, are better at this. However, those methods have difficulty producing continuous text.

The system now presented, on the other hand, captures “something deeper than language,” says Huth. It works at the level of semantics and meaning: “We can’t recognize the exact words, but we can recognize the guiding idea, because it changes more slowly.” The researchers were “shocked” at how well it worked. Decisive for this success were the great advances in language models over the last five and a half years. Accordingly, the Texas researchers used GPT, “the original version, not the current one,” as Huth points out.
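In the study, GPT proposes candidate word sequences while a per-subject encoding model, fitted on the training scans, predicts the BOLD response each candidate should evoke; a beam search keeps the candidates whose predictions best match the recording. The following toy sketch illustrates that loop (the tiny vocabulary, random stand-in encoding weights, and cosine scoring are all illustrative assumptions, not the authors’ code):

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = ["the", "dog", "ran", "home", "cat", "sat"]
W = rng.normal(size=(len(VOCAB), 50))            # stand-in encoding-model weights

def predict_bold(words):
    """Toy encoding model: map a word sequence to a predicted voxel pattern."""
    return W[[VOCAB.index(w) for w in words]].mean(axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

recorded = predict_bold(["the", "dog", "ran", "home"])  # stand-in for a real scan

beams = [["the"]]                                # partial transcripts under consideration
for _ in range(3):                               # extend one word at a time
    candidates = [b + [w] for b in beams for w in VOCAB]  # a real LM would propose these
    scored = sorted(candidates,
                    key=lambda c: cosine(predict_bold(c), recorded),
                    reverse=True)
    beams = scored[:3]                           # beam search: keep the 3 best matches

print("decoded:", " ".join(beams[0]))            # recovers the gist; word order may vary
```

Note how the scoring compares whole sequences against the scan rather than single words, which is why such a decoder recovers the gist of a passage rather than its exact wording.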

The system was tested with three subjects who each listened to stories for a total of 16 hours while their brain activity was recorded with an MRI scanner. The stories, roughly ten minutes long, were mostly sourced from the storytelling website The Moth. They are entertaining and at the same time cover a wide variety of topics, emphasizes Huth: “If you want good fMRI data, you must not bore the test subjects.”

After this training, the researchers tested how well the system could decode the content of previously unknown texts from fMRI data recorded while the subjects listened. This worked not only with texts the subjects heard, but also with texts they merely imagined. The system also captured quite well the content of a short silent film sequence they were watching.

The researchers are aware that their findings can be perceived as frightening and threatening. They therefore emphasize that this decoding of brain activity does not work without the cooperation of the test subjects. This applies not only to the lengthy training, but also to the decoding itself: if the subjects listened to a story while mentally distracting themselves, for example with arithmetic problems or by naming animals, the system failed. Nor does a system trained on one person work when applied to another; the method does not generalize across individuals.

However, Huth and his team are not hiding behind this current state of research, which is of course subject to change. It is not certain that decoding speech from brain activity will always require an MRI scanner weighing more than seven tons and hours of elaborate training. The technology could of course be misused, for example as a lie detector, explained team member Jerry Tang, urging that such issues be addressed “proactively.”

Another open question is to what extent the technology can serve as a communication aid for severely paralyzed patients who are otherwise unable to communicate through speech or gestures. For that, it would need to handle the subtleties of language better, where small changes can dramatically alter the meaning of what is being said. In their study, the researchers write that combining their approach with methods that target the motor activity of speech could help here. In addition, they plan future closed-loop experiments in which the subjects can see how the system has decoded their thoughts. That would give users the opportunity to adapt to the system and perhaps optimize the speech output in this way.

