1 |
Cross-lingual and cross-domain evaluation of Machine Reading Comprehension with Squad and CALOR-Quest corpora
|
|
|
|
In: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020) ; https://hal.archives-ouvertes.fr/hal-02973245 ; LREC 2020, May 2020, Marseille, France. pp.5491-5497 ; https://lrec2020.lrec-conf.org/en/ (2020)
|
|
Abstract:
Machine Reading has recently received a lot of attention thanks to both the availability of very large corpora such as SQuAD or MS MARCO containing triplets (document, question, answer), and the introduction of Transformer Language Models such as BERT, which obtain excellent results, even matching human performance according to the SQuAD leaderboard. One of the key features of Transformer Models is their ability to be jointly trained across multiple languages, using a shared subword vocabulary, leading to the construction of cross-lingual lexical representations. This feature has been used recently to perform zero-shot cross-lingual experiments where a multilingual BERT model fine-tuned on a machine reading comprehension task exclusively for English was directly applied to Chinese and French documents with interesting performance. In this paper we study the cross-language and cross-domain capabilities of BERT on a Machine Reading Comprehension task on two corpora: SQuAD and a new French Machine Reading dataset, called CALOR-QUEST. The semantic annotation available on CALOR-QUEST allows us to give a detailed analysis of the kinds of questions that are properly handled through the cross-language process.
|
|
Keywords:
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing; cross-lingual; FrameNet; Machine Reading Comprehension
|
|
URL: https://hal.archives-ouvertes.fr/hal-02973245 https://hal.archives-ouvertes.fr/hal-02973245/document https://hal.archives-ouvertes.fr/hal-02973245/file/2020.lrec-1.674.pdf
|
|
BASE
|
|
|
|
2 |
Analyse sémantique robuste par apprentissage antagoniste pour la généralisation de domaine
|
|
|
|
In: Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 4 : Démonstrations et résumés d'articles internationaux ; https://hal.archives-ouvertes.fr/hal-02768521 ; 2020, Nancy, France. pp.71-72 (2020)
|
|
BASE
|
|
|
|
3 |
Adapting a FrameNet Semantic Parser for Spoken Language Understanding Using Adversarial Learning
|
|
|
|
In: Interspeech 2019 ; https://hal.archives-ouvertes.fr/hal-02298417 ; Interspeech 2019, Sep 2019, Graz, Austria. pp.799-803, ⟨10.21437/Interspeech.2019-2732⟩ (2019)
|
|
BASE
|
|
|
|
4 |
Robust Semantic Parsing with Adversarial Learning for Domain Generalization
|
|
|
|
In: Proceedings of the 2019 Conference of the North ; https://hal.archives-ouvertes.fr/hal-02298402 ; Jun 2019, Minneapolis, Minnesota, United States. pp.166-173, ⟨10.18653/v1/N19-2021⟩ (2019)
|
|
BASE
|
|
|
|
5 |
The Impact of Word Representations on Sequential Neural MWE Identification
|
|
|
|
In: Proceedings of the Joint Workshop on Multiword Expressions and WordNet (MWE-WN 2019) ; https://hal.archives-ouvertes.fr/hal-02318287 ; Aug 2019, Florence, Italy. pp.169-175, ⟨10.18653/v1/W19-5121⟩ (2019)
|
|
BASE
|
|
|
|
8 |
Handling Normalization Issues for Part-of-Speech Tagging of Online Conversational Text
|
|
|
|
In: LREC proceedings ; Eleventh International Conference on Language Resources and Evaluation (LREC 2018) ; https://hal.archives-ouvertes.fr/hal-01943391 ; 2018, Miyazaki, Japan (2018)
|
|
BASE
|
|
|
|
9 |
Prédiction de l'échec d'une conversation médiée dans un contexte de dialogues à rôles asymétriques
|
|
|
|
In: Vingt-cinquième conférence sur le Traitement Automatique des Langues Naturelles (TALN) ; https://hal.archives-ouvertes.fr/hal-01798604 ; ATALA, May 2018, Rennes, France (2018)
|
|
BASE
|
|
|
|
10 |
Automatic transcription error recovery for Person Name Recognition
|
|
|
|
In: Interspeech 2012 ; https://hal.archives-ouvertes.fr/hal-02356295 ; Interspeech 2012, Sep 2012, Portland, United States (2012)
|
|
BASE
|
|
|
|
11 |
Detecting Person Presence in TV Shows with Linguistic and Structural Features
|
|
|
|
In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ; https://hal-amu.archives-ouvertes.fr/hal-01194256 ; 2012, Kyoto, Japan (2012)
|
|
BASE
|
|
|
|
13 |
Automatic customer feedback processing: alarm detection in open question spoken messages
|
|
|
|
In: Interspeech ; https://hal.archives-ouvertes.fr/hal-01314565 ; Interspeech, Sep 2008, Brisbane, Australia (2008)
|
|
BASE
|
|
|
|
15 |
Spoken Language Understanding Strategies on the France Telecom 3000 Voice Agency Corpus (Géraldine Damnati, Frédéric Béchet)
|
|
|
|
In: IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07 ; https://hal.archives-ouvertes.fr/hal-01317193 ; Apr 2007, Honolulu, United States (2007)
|
|
BASE
|
|
|
|
16 |
Experiments on the France Telecom 3000 Voice Agency corpus: academic research on an industrial spoken dialog system
|
|
|
|
In: NAACL-HLT-Dialog '07 ; https://hal.archives-ouvertes.fr/hal-01317176 ; NAACL-HLT-Dialog '07, Apr 2007, Rochester, United States (2007)
|
|
BASE
|
|
|
|
17 |
Conditional use of Word Lattices, Confusion Networks and 1-best string hypotheses in a Sequential Interpretation Strategy
|
|
|
|
In: Interspeech ; https://hal.archives-ouvertes.fr/hal-01312935 ; Interspeech, Aug 2007, Antwerp, Belgium (2007)
|
|
BASE
|
|
|
|
|
|