1 |
Cross-lingual and cross-domain evaluation of Machine Reading Comprehension with Squad and CALOR-Quest corpora
|
|
|
|
In: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), May 2020, Marseille, France, pp. 5491-5497 ; https://hal.archives-ouvertes.fr/hal-02973245 ; https://lrec2020.lrec-conf.org/en/ (2020)
|
|
Abstract:
Machine Reading has recently received a lot of attention thanks to both the availability of very large corpora such as SQuAD or MS MARCO, which contain (document, question, answer) triplets, and the introduction of Transformer Language Models such as BERT, which obtain excellent results, even matching human performance according to the SQuAD leaderboard. One of the key features of Transformer Models is their ability to be jointly trained across multiple languages using a shared subword vocabulary, leading to the construction of cross-lingual lexical representations. This feature has recently been used to perform zero-shot cross-lingual experiments in which a multilingual BERT model, fine-tuned on a machine reading comprehension task exclusively in English, was applied directly to Chinese and French documents with interesting performance. In this paper we study the cross-language and cross-domain capabilities of BERT on a Machine Reading Comprehension task on two corpora: SQuAD and a new French Machine Reading dataset called CALOR-QUEST. The semantic annotation available on CALOR-QUEST allows us to give a detailed analysis of the kinds of questions that are properly handled through the cross-language process.
|
|
Keywords:
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing; cross-lingual; FrameNet; Machine Reading Comprehension
|
|
URL: https://hal.archives-ouvertes.fr/hal-02973245 https://hal.archives-ouvertes.fr/hal-02973245/document https://hal.archives-ouvertes.fr/hal-02973245/file/2020.lrec-1.674.pdf
|
|
BASE
|
|
2 |
Handling Normalization Issues for Part-of-Speech Tagging of Online Conversational Text
|
|
|
|
In: LREC proceedings ; Eleventh International Conference on Language Resources and Evaluation (LREC 2018), 2018, Miyazaki, Japan ; https://hal.archives-ouvertes.fr/hal-01943391 (2018)
|
|
3 |
Prédiction de l'échec d'une conversation médiée dans un contexte de dialogues à rôles asymétriques [Predicting the failure of a mediated conversation in the context of asymmetric-role dialogues]
|
|
|
|
In: Vingt-cinquième conférence sur le Traitement Automatique des Langues Naturelles (TALN), ATALA, May 2018, Rennes, France ; https://hal.archives-ouvertes.fr/hal-01798604 (2018)
|
|
4 |
Iterative PLDA Adaptation for Speaker Diarization
|
|
|
|
In: Interspeech 2016, Sep 2016, San Francisco, United States, pp. 2175-2179, ⟨10.21437/Interspeech.2016-572⟩ ; https://hal.archives-ouvertes.fr/hal-01433172 (2016)
|
|
5 |
What Makes a Speaker Recognizable in TV Broadcast? Going Beyond Speaker Identification Error Rate
|
|
|
|
In: ERRARE Workshop, a satellite event of Interspeech 2015, 2015, Sinaia, Romania ; https://hal.archives-ouvertes.fr/hal-01433205 (2015)
|
|
6 |
Impact of overlapping speech detection on speaker diarization for broadcast news and debates
|
|
|
|
In: IEEE International Conference on Acoustics, Speech, and Signal Processing, Jan 2013, Vancouver, Canada ; https://hal.archives-ouvertes.fr/hal-01836475 (2013)
|
|
7 |
Automatic transcription error recovery for Person Name Recognition
|
|
|
|
In: Interspeech 2012, Sep 2012, Portland, United States ; https://hal.archives-ouvertes.fr/hal-02356295 (2012)
|
|