1 |
A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images ...
|
|
Lim, Yongwan; Toutios, Asterios; Bliesener, Yannick; Tian, Ye; Lingala, Sajan Goud; Vaz, Colin; Sorensen, Tanner; Oh, Miran; Harper, Sarah; Chen, Weiyi; Lee, Yoonjeong; Töger, Johannes; Lloréns Montesserin, Mairym; Smith, Caitlin; Godinez, Bianca; Goldstein, Louis; Byrd, Dani; Nayak, Krishna S; Narayanan, Shrikanth. figshare, 2021
|
|
Abstract:
Real-time magnetic resonance imaging (RT-MRI) of human speech production is enabling significant advances in speech science, linguistics, bio-inspired speech technology development, and clinical applications. Easy access to RT-MRI is, however, limited, and comprehensive datasets with broad access are needed to catalyze research across numerous domains. The imaging of the rapidly moving articulators and dynamic airway shaping during speech demands high spatio-temporal resolution and robust reconstruction methods. Further, while reconstructed images have been published, to date there is no open dataset providing raw multi-coil RT-MRI data from an optimized speech production experimental setup. Such datasets could enable new and improved methods for dynamic image reconstruction, artifact correction, feature extraction, and direct extraction of linguistically relevant biomarkers. The present dataset offers a unique corpus of 2D sagittal-view RT-MRI videos along with synchronized audio for 75 subjects performing ...
|
|
Keyword:
170203 Knowledge Representation and Machine Learning; 170204 Linguistic Processes incl. Speech Production and Comprehension; 200404 Laboratory Phonetics and Speech Science; 80106 Image Processing; 90303 Biomedical Instrumentation; 90304 Medical Devices; 90609 Signal Processing; Artificial Intelligence and Image Processing; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; FOS Languages and literature; FOS Medical engineering; FOS Psychology; Linguistics
|
|
URL: https://dx.doi.org/10.6084/m9.figshare.13725546 https://figshare.com/articles/dataset/A_multispeaker_dataset_of_raw_and_reconstructed_speech_production_real-time_MRI_video_and_3D_volumetric_images/13725546
|
|
5 |
The Twins corpus of museum visitor questions
|
|
|
|
In: http://people.ict.usc.edu/~traum/Papers/twins-corpus.pdf (2012)
|
|
6 |
Morphological variation in the adult vocal tract: A study using rtMRI
|
|
|
|
In: http://mproctor.net/docs/lammert11_IS2011_morphology.pdf (2011)
|
|
7 |
Direct Estimation of Articulatory Kinematics from Real-time Magnetic Resonance Image Sequences
|
|
|
|
In: http://mproctor.net/docs/proctor11_IS2011_constriction.pdf (2011)
|
|
8 |
A multimodal real-time MRI articulatory corpus for speech research
|
|
|
|
In: http://mproctor.net/docs/narayanan11_IS2011_MRI-TIMIT.pdf (2011)
|
|
9 |
Para-linguistic mechanisms of production in human 'beatboxing': a real-time MRI study
|
|
|
|
In: http://mproctor.net/docs/proctor10_IS2010_beatboxing.pdf (2010)
|
|
10 |
Rapid semiautomatic segmentation of real-time Magnetic Resonance Images for parametric vocal tract analysis
|
|
|
|
In: http://mproctor.net/docs/proctor10_IS2010_segmentation.pdf (2010)
|
|
11 |
Connecting rhythm and prominence in automatic ESL pronunciation scoring
|
|
|
|
In: http://www.josephtepperman.com/nava_rhythm_2009.pdf (2009)
|
|
12 |
Analysis of emotionally salient aspects of fundamental frequency for emotion detection
|
|
|
|
In: http://www.utdallas.edu/dept/eecs/research/researchlabs/msp-lab/publications/Busso_2009.pdf (2009)
|
|
13 |
Automatic classification of question turns in spontaneous speech using lexical and prosodic evidence
|
|
|
|
In: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2631211/pdf/nihms54014.pdf (2008)
|
|
14 |
An empirical analysis of user uncertainty in problem-solving child–machine interactions
|
|
|
|
In: http://care.usc.edu/research/Papers/MattJeannette-Uncertainty.pdf (2008)
|
|
15 |
A text-free approach to assessing nonnative intonation
|
|
|
|
In: http://sail.usc.edu/publications/tepperman_intonation_icslp07.pdf (2007)
|
|
16 |
An English-Persian Automatic Speech Translator: Recent Developments in Domain Portability and User Modeling
|
|
|
|
In: http://sail.usc.edu/publications/georgiou_isyc2006_s2s.pdf (2006)
|
|
17 |
Abstract
|
|
|
|
In: http://sail.usc.edu/publications/Sethy-SpeechComm2006.pdf (2006)
|
|
18 |
Combining acoustic, lexical, and syntactic evidence for automatic unsupervised prosody labeling
|
|
|
|
In: http://sail.usc.edu/publications/ananthak_icslp06.pdf (2006)
|
|
19 |
Detection of Non-Native Named Entities Using Prosodic Features for Improved Speech Recognition and Translation
|
|
|
|
In: http://isca-speech.org/archive_open/archive_papers/ml06/ml06_003.pdf (2006)
|
|
20 |
Vector-based representation and clustering of audio using onomatopoeia words
|
|
|
|
In: https://www.aaai.org/Papers/Symposia/Fall/2006/FS-06-01/FS06-01-012.pdf (2006)
|
|