1 |
Transdisciplinary Analysis of a Corpus of French Newsreels: The ANTRACT Project
|
|
|
|
In: ISSN: 1938-4122 ; Digital Humanities Quarterly ; https://hal.archives-ouvertes.fr/hal-03166755 ; Digital Humanities Quarterly, Alliance of Digital Humanities, 2021, Special Issue on AudioVisual Data in DH, 15 (1) ; http://digitalhumanities.org/dhq/ (2021)
|
|
|
|
3 |
CRF based context modeling for person identification in broadcast videos
|
|
|
|
In: ISSN: 2297-198X ; EISSN: 2297-198X ; Frontiers in information and communication technologies ; https://hal.archives-ouvertes.fr/hal-01433154 ; Frontiers in information and communication technologies, Frontiers Media S.A., 2016, 3, pp.9. ⟨10.3389/fict.2016.00009⟩ ; http://journal.frontiersin.org (2016)
|
|
|
|
4 |
Iterative PLDA Adaptation for Speaker Diarization
|
|
|
|
In: Interspeech 2016 ; https://hal.archives-ouvertes.fr/hal-01433172 ; Interspeech 2016, Sep 2016, San Francisco, United States. pp.2175 - 2179, ⟨10.21437/Interspeech.2016-572⟩ (2016)
|
|
|
|
5 |
What Makes a Speaker Recognizable in TV Broadcast? Going Beyond Speaker Identification Error Rate
|
|
|
|
In: Interspeech 2015 ; ERRARE Workshop, a satellite event of Interspeech 2015. ; https://hal.archives-ouvertes.fr/hal-01433205 ; ERRARE Workshop, a satellite event of Interspeech 2015., 2015, Sinaia, Romania (2015)
|
|
Abstract:
Speaker identification approaches for TV broadcast are usually evaluated and compared based on global error rates derived from the overall duration of missed detection, false alarm and confusion. Based on an analysis of the output of the systems submitted to the final round of the French evaluation campaign REPERE, this paper shows that these average metrics lead to the incorrect intuition that current state-of-the-art algorithms partially recognize all speakers. Setting aside incorrect diarization and adverse acoustic conditions, we show that their performance is in fact essentially bimodal: in a given show, either all speech turns of a speaker are correctly identified or none of them are. We then try to understand and explain this behavior through performance prediction experiments. These experiments show that the most discriminant speaker characteristics are, first, their total speech duration in the current show and, only then, the amount of training data available to build their acoustic model.
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; error analysis; speaker recognition; TV broadcast
|
|
URL: https://hal.archives-ouvertes.fr/hal-01433205/document https://hal.archives-ouvertes.fr/hal-01433205 https://hal.archives-ouvertes.fr/hal-01433205/file/Charlet2015.pdf
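The global error rate mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the usual duration-based definition (missed detection plus false alarm plus confusion, divided by total reference speech time); the `SpeakerStats` class, the function names, and all durations are hypothetical and are not taken from the paper or from the REPERE data. The point is only to show how a single averaged figure can hide per-speaker behavior that is all-or-nothing.

```python
from dataclasses import dataclass


@dataclass
class SpeakerStats:
    """Hypothetical per-speaker durations (in seconds) for one show."""
    name: str
    reference: float    # total reference speech time for this speaker
    missed: float       # reference speech with no identity assigned
    false_alarm: float  # time wrongly attributed to this speaker
    confusion: float    # reference speech attributed to someone else


def error_rate(missed: float, false_alarm: float,
               confusion: float, reference: float) -> float:
    """Duration-based error rate: all error time over reference time."""
    if reference <= 0:
        raise ValueError("reference duration must be positive")
    return (missed + false_alarm + confusion) / reference


def per_speaker_rates(stats: list[SpeakerStats]) -> dict[str, float]:
    """Error rate computed separately for each speaker."""
    return {
        s.name: error_rate(s.missed, s.false_alarm, s.confusion, s.reference)
        for s in stats
    }


def global_rate(stats: list[SpeakerStats]) -> float:
    """Error rate pooled over all speakers, as in a global metric."""
    return error_rate(
        sum(s.missed for s in stats),
        sum(s.false_alarm for s in stats),
        sum(s.confusion for s in stats),
        sum(s.reference for s in stats),
    )


# Made-up show: two speakers fully recognized, two fully confused.
show = [
    SpeakerStats("A", reference=120.0, missed=0.0, false_alarm=0.0, confusion=0.0),
    SpeakerStats("B", reference=80.0, missed=0.0, false_alarm=0.0, confusion=0.0),
    SpeakerStats("C", reference=60.0, missed=0.0, false_alarm=0.0, confusion=60.0),
    SpeakerStats("D", reference=40.0, missed=0.0, false_alarm=0.0, confusion=40.0),
]

rates = per_speaker_rates(show)
overall = global_rate(show)

print(f"global error rate: {overall:.3f}")
for name, rate in rates.items():
    print(f"speaker {name}: {rate:.3f}")
```

In this toy data the pooled rate is about one third, which could be read as every speaker being recognized two thirds of the time, while the per-speaker rates are each either 0 or 1.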
|
|
|
|
7 |
Improving recognition of proper nouns (in ASR) through generation and filtering of phonetic transcriptions
|
|
|
|
In: ISSN: 0885-2308 ; EISSN: 1095-8363 ; Computer Speech and Language ; https://hal.archives-ouvertes.fr/hal-01433238 ; Computer Speech and Language, Elsevier, 2014, 28 (4), pp.979-996. ⟨10.1016/j.csl.2014.02.006⟩ (2014)
|
|
|
|
8 |
Semi-Supervised and Unsupervised Data Extraction Targeting Speakers: From Speaker Roles to Fame?
|
|
|
|
In: Proceedings of the First Workshop on Speech, Language and Audio in Multimedia (SLAM), ; Interspeech satellite workshop on Speech, Language and Audio in Multimedia (SLAM) ; https://hal.archives-ouvertes.fr/hal-01433450 ; Interspeech satellite workshop on Speech, Language and Audio in Multimedia (SLAM), 2013, Marseille, France (2013)
|
|
|
|
9 |
Incorporating Named Entity Recognition into the Speech Transcription Process
|
|
|
|
In: Interspeech 2013 ; Interspeech ; https://hal.archives-ouvertes.fr/hal-01433438 ; Interspeech, 2013, Lyon, France (2013)
|
|
|
|
11 |
Acoustics-Based Phonetic Transcription Method for Proper Nouns
|
|
|
|
In: International Conference on Spoken Language Processing (ISCA, Interspeech 2010) ; https://hal.archives-ouvertes.fr/hal-01433899 ; International Conference on Spoken Language Processing (ISCA, Interspeech 2010), 2010, Makuhari, Japan (2010)
|
|
|
|
12 |
Iterative filtering of phonetic transcriptions of proper nouns
|
|
|
|
In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009) ; https://hal.archives-ouvertes.fr/hal-01433945 ; IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009), 2009, Taipei, Taiwan. pp.4265--4268 (2009)
|
|
|
|
13 |
Grapheme to phoneme conversion using an SMT system
|
|
|
|
In: INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION. ANNUAL CONFERENCE. 10TH 2009. (INTERSPEECH 2009) ; 10th Annual Conference of the International Speech Communication Association 2009 (INTERSPEECH 2009) ; https://hal.archives-ouvertes.fr/hal-01451534 ; 10th Annual Conference of the International Speech Communication Association 2009 (INTERSPEECH 2009) , Sep 2009, Brighton, United Kingdom. pp.716-719 (2009)
|
|
|
|
14 |
Etude pour l’amélioration d’un système d’identification nommée du locuteur [A study on improving a named speaker identification system]
|
|
|
|
In: Actes JEP-TALN 2010 ; Journées d'Etude de la Parole ; https://hal.archives-ouvertes.fr/hal-00412340 ; Journées d'Etude de la Parole, Jun 2008, Avignon, France. pp.10 (2008)
|
|
|
|
15 |
Combinaison de systèmes pour la phonétisation automatique de noms propres [Combining systems for the automatic phonetic transcription of proper nouns]
|
|
|
|
In: XXVIIe Journées d'étude sur la parole (JEP 2008) ; https://hal.archives-ouvertes.fr/hal-01450912 ; XXVIIe Journées d'étude sur la parole (JEP 2008), Jun 2008, Avignon, France. pp.4 (2008)
|
|
|
|
16 |
Combined systems for automatic phonetic transcription of proper nouns
|
|
|
|
In: LREC 2008 Proceedings ; 6th Language Evaluation and Resources Conference (LREC 2008) ; https://hal.archives-ouvertes.fr/hal-01433960 ; 6th Language Evaluation and Resources Conference (LREC 2008), May 2008, Marrakech, Morocco. pp.1791-1795 ; http://www.lrec-conf.org/lrec2008/ (2008)
|
|
|
|
17 |
Extracting true speaker identities from transcriptions
|
|
|
|
In: Interspeech 2007 ; https://hal.archives-ouvertes.fr/hal-01434096 ; Interspeech 2007, 2007, Antwerp, Belgium (2007)
|
|
|
|
20 |
Speaker diarization: about whom the speaker is talking?
|
|
|
|
In: 2006 IEEE Odyssey - The Speaker and Language Recognition Workshop ; IEEE Speaker Odyssey 2006 ; https://hal.archives-ouvertes.fr/hal-01434121 ; IEEE Speaker Odyssey 2006, 2006, San Juan, Puerto Rico (2006)
|
|
|
|
|
|