1 |
Hippocampal and auditory contributions to speech segmentation
In: ISSN: 0010-9452 ; Cortex ; https://hal.archives-ouvertes.fr/hal-03604957 ; Cortex, Elsevier, 2022, ⟨10.1016/j.cortex.2022.01.017⟩ (2022)
BASE
2 |
Speaking clearly improves speech segmentation by statistical learning under optimal listening conditions ...
3 |
The effect of lengthening aspiration on speech segmentation ...
4 |
End-to-end speaker segmentation for overlap-aware resegmentation
In: Interspeech 2021 ; https://hal-univ-lemans.archives-ouvertes.fr/hal-03257524 ; Interspeech 2021, Aug 2021, Brno, Czech Republic ; https://www.interspeech2021.org/ (2021)
Abstract:
Speaker segmentation consists of partitioning a conversation between one or more speakers into speaker turns. The task is usually addressed as the late combination of three sub-tasks (voice activity detection, speaker change detection, and overlapped speech detection); we propose instead to train an end-to-end segmentation model that performs it directly. Inspired by the original end-to-end neural speaker diarization approach (EEND), the task is modeled as a multi-label classification problem using permutation-invariant training. The main difference is that our model operates on short audio chunks (5 seconds) but at a much higher temporal resolution (every 16 ms). Experiments on multiple speaker diarization datasets show that our model can be used with great success for both voice activity detection and overlapped speech detection. Our proposed model can also be used as a post-processing step, to detect and correctly assign overlapped speech regions. Relative diarization error rate improvement over the best considered baseline (VBx) reaches 17% on AMI, 13% on DIHARD 3, and 13% on VoxConverse.
Keyword:
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]; overlapped speech detection; resegmentation; speaker diarization; speaker segmentation; voice activity detection
URL: https://hal-univ-lemans.archives-ouvertes.fr/hal-03257524/document https://hal-univ-lemans.archives-ouvertes.fr/hal-03257524/file/2104.04045.pdf https://hal-univ-lemans.archives-ouvertes.fr/hal-03257524
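The abstract above describes the task as multi-label classification trained with permutation-invariant training. As a rough illustration only, and not the authors' implementation, the following sketch computes a permutation-invariant binary cross-entropy over a short sequence of frames. The function names (`bce`, `permutation_invariant_bce`), the list-of-lists data layout, and the exhaustive search over permutations are all assumptions made for this example.

```python
import math
from itertools import permutations


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one probability p and one binary label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def permutation_invariant_bce(pred, target):
    """Return (loss, perm) minimizing mean BCE over speaker permutations.

    pred and target are lists of frames; each frame is a list with one
    value per speaker (hypothetical layout chosen for this sketch).
    perm[i] is the reference speaker matched to model output i.
    """
    num_frames = len(pred)
    num_speakers = len(pred[0])
    best_loss, best_perm = math.inf, None
    # Exhaustive search: fine for a handful of speakers per chunk.
    for perm in permutations(range(num_speakers)):
        total = 0.0
        for p_frame, t_frame in zip(pred, target):
            for out_idx, ref_idx in enumerate(perm):
                total += bce(p_frame[out_idx], t_frame[ref_idx])
        loss = total / (num_frames * num_speakers)
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Three frames, two speakers; the middle frame contains overlapped speech.
pred = [[0.9, 0.1], [0.9, 0.8], [0.1, 0.1]]
# Reference labels list the same speakers in the opposite order.
target = [[0, 1], [1, 1], [0, 0]]
loss, perm = permutation_invariant_bce(pred, target)  # perm == (1, 0)
```

Because the loss is minimized over speaker orderings, the model is not penalized for emitting speakers in a different order than the reference, which is what makes a multi-label formulation with interchangeable output channels trainable.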
5 |
Impact of Encoding and Segmentation Strategies on End-to-End Simultaneous Speech Translation
In: INTERSPEECH 2021 ; https://hal.archives-ouvertes.fr/hal-03372487 ; INTERSPEECH 2021, Aug 2021, Brno, Czech Republic (2021)
6 |
Oscillatory activity and EEG phase synchrony of concurrent word segmentation and meaning-mapping in 9-year-old children
In: ISSN: 1878-9293 ; EISSN: 1878-9307 ; Developmental Cognitive Neuroscience ; https://hal.archives-ouvertes.fr/hal-03334735 ; Developmental Cognitive Neuroscience, Elsevier, 2021, 51, pp.101010. ⟨10.1016/j.dcn.2021.101010⟩ (2021)
7 |
Transdisciplinary Analysis of a Corpus of French Newsreels: The ANTRACT Project
In: ISSN: 1938-4122 ; Digital Humanities Quarterly ; https://hal.archives-ouvertes.fr/hal-03166755 ; Digital Humanities Quarterly, Alliance of Digital Humanities, 2021, Special Issue on AudioVisual Data in DH, 15 (1) ; http://digitalhumanities.org/dhq/ (2021)
8 |
Speaking clearly improves speech segmentation by statistical learning under optimal listening conditions
In: Laboratory Phonology: Journal of the Association for Laboratory Phonology; Vol 12, No 1 (2021); 14 ; 1868-6354 (2021)
11 |
Production of nonce words to establish the cues for prominence and grouping in English ...
12 |
Early Tashelhiyt Berber word segmentation: the role of the Possible Word Constraint ...
14 |
The Iambic Trochaic Law in speech: The case of Japanese ...
16 |
Developing Core Technologies for Resource-Scarce Nguni Languages
In: Information; Volume 12; Issue 12; Pages: 520 (2021)
17 |
Discovering structure in speech recordings: Unsupervised learning of word and phoneme like units for automatic speech recognition
In: Fraunhofer IAIS (2021)
18 |
Projecting action spaces. On the interactional relevance of cesural areas in co-enactments
In: Open Linguistics, Vol 7, Iss 1, Pp 638-665 (2021)
19 |
Streaming cascade-based speech translation leveraged by a direct segmentation model
20 |
Developing Resources for Automated Speech Processing of Quebec French
In: Proceedings of the 12th Language Resources and Evaluation Conference ; 12th Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-03042864 ; 12th Language Resources and Evaluation Conference, 2020, Marseille, France. pp.5323-5328 (2020)