1. Resourceful at Any Size: A Predictive Methodology Using Linguistic Corpus Metrics for Multi-Source Training in Neural Dependency Parsing

2. ASR and Human Recognition Errors: Predictability and Lexical Factors

6. The Language of Law: An Analysis of Gender and Turn-Taking in U.S. Supreme Court Oral Arguments

7. Speech to Text to Semantics: A Sequence-to-Sequence System for Spoken Language Understanding

8. Dialogical Signals of Stance Taking in Spontaneous Conversation

10. Exploring Phone Recognition in Pre-verbal and Dysarthric Speech

11. Enriching Scientific Paper Embeddings with Citation Context

12. Labeling and Automatically Identifying Basic-Level Categories

13. Exposing the Hidden Vocal Channel: Analysis of Vocal Expression

14. Three Cheers For Partisanship: Lexical Framing and Applause in U.S. Presidential Primary Debates

15. STREAMLInED Challenges: Aligning Research Interests with Shared Tasks
In: 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages, March 6-7, 2017. Honolulu, Hawai‘i, USA (2017)

16. The prosody of negative ‘yeah’
In: LSA Annual Meeting Extended Abstracts; Vol 6: LSA Annual Meeting Extended Abstracts 2015; 6:1-5; 2377-3367 (2015)

17. Detection of Agreement and Disagreement: An investigation of linguistic coordination and conversational features

18. An Independent Assessment of Phonetic Distinctive Feature Sets used to Model Pronunciation Variation
Abstract: Thesis (Master's), University of Washington, 2014. It has been consistently shown that Automatic Speech Recognition (ASR) performs much worse on casual, spontaneous speech than on carefully planned or read speech, with word error rates as much as double, and that variation in pronunciation is the main reason for this degradation. Thus far, attempts to mitigate this have fallen well below expectations. Phonetic distinctive features show promise from a theoretical standpoint, but have not yet been fully incorporated into an end-to-end ASR system. Work incorporating distinctive features into ASR is widespread and varied, but each project uses a unique set of features based on the authors' linguistic intuitions, so the results of these experiments cannot be fully and fairly compared. In this work, I attempt to determine which style of distinctive feature set is best suited to modeling pronunciation variation in ASR, based on measures of surface phone prediction accuracy and the efficiency of the decision tree model. Using a non-exhaustive, representative set of phonetic distinctive feature sets, decision trees were trained, one per canonical base form phone, under two experimental conditions: words in isolation and words in sequence. These models were tested against a comparable held-out test set and against an additional data set of canonical pronunciations used to simulate formal speech. A multi-valued articulatory-based feature set provided a far more compact model with comparable accuracy. In a comparison of binary feature sets, the model with feature redundancy was far more robust: it had slightly higher accuracy and, where it predicted an incorrect phone, its prediction was closer to the gold standard phone than the other feature sets' predictions.
Keywords: ASR; Distinctive Features; Linguistics; Pronunciation Modeling
URL: http://hdl.handle.net/1773/25371