Search in the Catalogues and Directories

Hits 1 – 20 of 35

1
Speaking Style Variability in Speaker Discrimination by Humans and Machines
Afshan, Amber. - : eScholarship, University of California, 2022
Abstract: A speaker's voice constantly varies in everyday situations, such as when talking to a friend, reading aloud, talking to pets, or narrating a happy incident. These changes in speaking style affect human and machine abilities to distinguish speakers based on their voice. This dissertation studies the effects of speaking style variability on speaker discrimination performance by humans and machines. We compare human speaker discrimination performance for read speech versus casual conversations. Listeners perform better when stimuli are style-matched, particularly in read speech -- read speech trials. They perform the worst in style-mismatched conditions. Moderate style variability affects the "same speaker" task more than the "different speaker" task. The speakers who are "easy" or "hard" to "tell together" are not the same as those who are "easy" or "hard" to "tell apart." Analysis of acoustic variability suggests that listeners find it easier to "tell speakers together" when they rely on speaker-specific idiosyncrasies and that they "tell speakers apart" based on their relative positions within a shared acoustic space. The effects of style variability on automatic speaker verification (ASV) systems are systematically analyzed using the UCLA Speaker Variability database, which comprises multiple speaking styles per speaker. The performance is better when enrollment and test utterances are of the same style, but it substantially degrades when styles are mismatched. We hypothesize that between-frame entropy can capture style-related spectral and temporal variations. We propose an entropy-based variable frame rate (VFR) technique to address style variability in two different approaches: data augmentation and self-attentive conditioning. Both approaches improve performance in style-mismatch scenarios and are comparable in performance. Furthermore, humans and machines seem to employ different approaches to speaker discrimination. In an attempt to improve ASV performance in the presence of style variability, insights learnt from the human speaker perception experiments are used to design a training loss function, referred to as "CllrCE loss". CllrCE loss focuses on both speaker-specific idiosyncrasies and relative acoustic distances between the speakers to train the ASV system. This loss function improves ASV performance in case of style variability, especially in the case of moderate style variations from conversational speech.
Keyword: Acoustic space analysis; Computer engineering; Electrical engineering; Human speaker perception; Self-attention conditioning; Speaker verification; Speaking style; Variable frame rate
URL: https://escholarship.org/uc/item/3zh346jm
BASE
2
An Efficient Method for Biomedical Entity Linking Based on Inter- and Intra-Entity Attention
In: Applied Sciences ; Volume 12 ; Issue 6 ; Pages: 3191 (2022)
BASE
3
ТРУДНОСТИ ПРЕПОДАВАНИЯ АНГЛИЙСКОГО ЯЗЫКА В НЕЯЗЫКОВОМ ВУЗЕ ... : CHALLENGES OF TEACHING ENGLISH IN A NON-LINGUISTIC HIGHER SCHOOL ...
Л.Ю. Обухова. - : Мир науки, культуры, образования, 2021
BASE
4
A Neural N-Gram-Based Classifier for Chinese Clinical Named Entity Recognition
In: Applied Sciences ; Volume 11 ; Issue 18 (2021)
BASE
5
A Transformer-Based Neural Machine Translation Model for Arabic Dialects That Utilizes Subword Units
In: Sensors ; Volume 21 ; Issue 19 (2021)
BASE
6
High-intensity interval training upon cognitive and psychological outcomes in youth : a systematic review
BASE
7
When to Make the Sensory Social: Registering in Face-to-Face Openings
In: Faculty Publications (2020)
BASE
8
Extractive summarization using siamese hierarchical transformer encoders
BASE
9
Using Bidirectional Encoder Representations from Transformers for Conversational Machine Comprehension ; Användning av BERT-språkmodell för konversationsförståelse
Gogoulou, Evangelina. - : KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019
BASE
10
Improving Hybrid CTC/Attention Architecture with Time-Restricted Self-Attention CTC for End-to-End Speech Recognition
In: Applied Sciences ; Volume 9 ; Issue 21 (2019)
BASE
11
Exploring Efficient Neural Architectures for Linguistic–Acoustic Mapping in Text-To-Speech
In: Applied Sciences ; Volume 9 ; Issue 16 (2019)
BASE
12
When to make the sensory social: Registering in copresent openings
In: Communication Scholarship (2019)
BASE
13
Bridging the gap: attending to discontinuity in identification of multiword expressions
Mitkov, Ruslan; Kouchaki, Samaneh; Taslimipoor, Shiva. - : Association for Computational Linguistics, 2019
BASE
14
Deep neural networks for natural language processing and its acceleration
Lin, Zhouhan. - 2019
BASE
15
Light and heavy drinking in jurisdictions with different alcohol policy environments.
In: The International journal on drug policy, vol. 65, pp. 86-96 (2019)
BASE
16
Testing the Bilingual Advantage Hypothesis: Language Balance and Self-Regulation
Zweig, Hannah Victoria. - : University of Oregon, 2018
BASE
17
Propuesta de intervención psicoeducativa en un caso de dislexia
Bonavetti Bernal, Cynthia Anabel. - : Universitat Jaume I, 2018
BASE
18
Examining links between anxiety, reinvestment and walking when talking by older adults during adaptive gait
Young, WR; Olonilua, M; Masters, RSW. - : Springer Verlag (Germany), 2015
BASE
19
Orthographic learning during reading: the role of whole-word visual processing
In: Journal of Research in Reading, Wiley, 2015, 38, pp. 141-158 ; ISSN: 0141-0423 ; EISSN: 1467-9817 ; https://hal.archives-ouvertes.fr/hal-01218316 ; DOI: 10.1111/j.1467-9817.2012.01551.x (2015)
BASE
20
INDIVIDUAL DIFFERENCES IN PREDICTIVE PROCESSING: EVIDENCE FROM SUBJECT FILLED-GAP EFFECTS IN NATIVE AND NONNATIVE SPEAKERS OF ENGLISH
Johnson, Adrienne Marie. - : University of Kansas, 2015
BASE

© 2013 - 2024 Lin|gu|is|tik