1 |
A Speech-Level–Based Segmented Model to Decode the Dynamic Auditory Attention States in the Competing Speaker Scenes
In: Front Neurosci (2022)
Abstract:
In competing-speaker environments, human listeners need to focus or switch their auditory attention according to dynamic intentions. Reliable cortical tracking of the speech envelope is an effective feature for decoding the target speech from neural signals. Moreover, previous studies revealed that root mean square (RMS)–level–based speech segmentation contributes substantially to target speech perception under the modulation of sustained auditory attention. This study further investigated the effect of RMS-level–based speech segmentation on auditory attention decoding (AAD) performance with both sustained and switched attention in competing-speaker auditory scenes. Objective biomarkers derived from cortical activity were also developed to index dynamic auditory attention states. Subjects were asked to concentrate on, or switch their attention between, two competing speaker streams. The neural responses to the higher- and lower-RMS-level speech segments were analyzed via the linear temporal response function (TRF) before and after attention switched from one speaker stream to the other. Furthermore, the AAD performance of the unified TRF decoding model was compared with that of the speech-RMS-level–based segmented decoding model as the auditory attention state changed dynamically. The results showed that the weight of the typical TRF component at a time lag of approximately 100 ms was sensitive to the switching of auditory attention. Compared with the unified AAD model, the segmented AAD model improved attention decoding performance under both sustained and switched auditory attention across a wide range of signal-to-masker ratios (SMRs). In competing-speaker scenes, the TRF weight and AAD accuracy could be used as effective indicators to detect changes in auditory attention.
In addition, across a wide range of SMRs (i.e., from 6 to –6 dB in this study), the segmented AAD model showed robust decoding performance even with a short decision window length, suggesting that this speech-RMS-level–based model has the potential to decode dynamic attention states in realistic auditory scenarios.
Keyword:
Neuroscience
URL: https://doi.org/10.3389/fnins.2021.760611 http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8866945/
BASE
4 |
The Development of Categorical Perception of Segments and Suprasegments in Mandarin-Speaking Preschoolers
In: Front Psychol (2021)
BASE
5 |
Book Review: Speech Perception, Production and Acquisition: Multidisciplinary Approaches in Chinese Languages
In: Front Psychol (2021)
BASE
6 |
A Review of Speech Perception of Mandarin-Speaking Children With Cochlear Implantation
In: Front Neurosci (2021)
BASE
7 |
Reduced Sensitivity to Between-Category Information but Preserved Categorical Perception of Lexical Tones in Tone Language Speakers With Congenital Amusia
In: Front Psychol (2020)
BASE
8 |
The time course of orthographic and semantic activation in Chinese character recognition: evidence from an ERP study ...
BASE
9 |
The time course of orthographic and semantic activation in Chinese character recognition: evidence from an ERP study ...
BASE
10 |
Cantonese Tone Identification in Three Temporal Cues in Quiet, Speech-Shaped Noise and Two-Talker Babble
BASE
11 |
Quantitative Assessment of Blood Pressure Measurement Accuracy and Variability from Visual Auscultation Method by Observers without Receiving Medical Training
BASE
12 |
Assessing the effect of noise-reduction to the intelligibility of low-pass filtered speech
BASE
17 |
Impact of SNR and Gain-Function Over- and Under-estimation on Speech Intelligibility
BASE
18 |
Contributions of cochlea-scaled entropy and consonant-vowel boundaries to prediction of speech intelligibility in noise
BASE