3 |
Differential Evaluation: a Qualitative Analysis of Natural Language Processing System Behavior Based Upon Data Resistance to Processing
|
|
|
|
In: Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems (EVAL4NLP), EMNLP 2021, Nov 2021, Punta Cana, Dominican Republic ; https://hal.archives-ouvertes.fr/hal-03432331 (2021)
|
|
|
|
5 |
DiaBLa: A Corpus of Bilingual Spontaneous Written Dialogues for Machine Translation
|
|
|
|
In: Language Resources and Evaluation, Springer Verlag, 2020, ⟨10.1007/s10579-020-09514-4⟩ ; ISSN: 1574-020X ; EISSN: 1574-0218 ; https://hal.inria.fr/hal-03021633 (2020)
|
|
|
|
6 |
Automatic Removal of Identifying Information in Official EU Languages for Public Administrations: The MAPA Project
|
|
|
|
In: Legal Knowledge and Information Systems ; Frontiers in Artificial Intelligence and Applications ; International Conference on Legal Knowledge and Information Systems, Dec 2020, Brno, Prague, Czech Republic. pp.223-226, ⟨10.3233/FAIA200869⟩ ; https://hal.archives-ouvertes.fr/hal-03058311 ; http://ebooks.iospress.nl/volume/legal-knowledge-and-information-systems-jurix-2020-the-thirty-third-annual-conference-brno-czech-republic-december-911-2020 (2020)
|
|
|
|
7 |
CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters
|
|
|
|
In: International Conference on Computational Linguistics, Dec 2020, Barcelona (online), Spain. pp.6903-6915 ; https://hal.archives-ouvertes.fr/hal-03100665 ; https://coling2020.org/ (2020)
|
|
|
|
8 |
Embedding Strategies for Specialized Domains: Application to Clinical Entity Recognition
|
|
|
|
In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, Jul 2019, Florence, Italy. pp.295-301, ⟨10.18653/v1/P19-2041⟩ ; https://hal.archives-ouvertes.fr/hal-02860947 (2019)
|
|
|
|
9 |
Detecting context-dependent sentences in parallel corpora ; Détection dans des corpus parallèles de phrases dépendantes du contexte
|
|
|
|
In: 25th Conférence sur le Traitement Automatique des Langues Naturelles, 2018, Rennes, France. pp.393-400 ; https://hal.archives-ouvertes.fr/hal-01800736 (2018)
|
|
|
|
10 |
CLEF eHealth 2017 Multilingual Information Extraction task Overview: ICD10 Coding of Death Certificates in English and French.
|
|
|
|
In: Workshop of the Cross-Language Evaluation Forum, CEUR-WS, Jan 2017, Dublin, Ireland ; https://hal.archives-ouvertes.fr/hal-01665374 (2017)
|
|
|
|
11 |
LIMSI@WMT16: Machine Translation of News
|
|
|
|
In: First Conference on Machine Translation, Aug 2016, Berlin, Germany. pp.239-245, ⟨10.18653/v1/W16-2304⟩ ; https://hal.archives-ouvertes.fr/hal-01388659 ; https://statmt.org/wmt16 (2016)
|
|
|
|
12 |
Two-Step MT: Predicting Target Morphology
|
|
|
|
In: International Workshop on Spoken Language Translation, 2016, Seattle, WA, United States ; https://hal.archives-ouvertes.fr/hal-01592337 (2016)
|
|
|
|
16 |
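Oublier ce qu'on sait, pour mieux apprendre ce qu'on ne sait pas : une étude sur les contraintes de type dans les modèles CRF [Forgetting what we know to better learn what we don't know: a study of type constraints in CRF models]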
|
|
|
|
In: Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs ; Conférence sur le Traitement Automatique des Langues Naturelles, ATALA, 2015, Caen, France ; https://hal.archives-ouvertes.fr/hal-01634997 (2015)
|
|
|
|
17 |
LIMSI@WMT'15: Translation Task
|
|
|
|
In: Proceedings of the Tenth Workshop on Statistical Machine Translation, Sep 2015, Lisbon, Portugal. pp.145-151, ⟨10.18653/v1/W15-3016⟩ ; https://hal.archives-ouvertes.fr/hal-02912383 (2015)
|
|
|
|
18 |
Automatic language identity tagging on word and sentence-level in multilingual text sources: a case-study on Luxembourgish
|
|
|
|
In: International Conference on Language Resources and Evaluation, May 2014, Reykjavik, Iceland ; https://hal.archives-ouvertes.fr/hal-01843401 (2014)
|
|
Abstract:
Luxembourgish, embedded in a multilingual context on the divide between Romance and Germanic cultures, remains one of Europe's under-described languages. This is because written production remains relatively low, and linguistic knowledge and resources, such as lexicons and pronunciation dictionaries, are sparse. Speakers and writers frequently switch between Luxembourgish, German, and French, both from one sentence to the next and within a sentence. To build resources such as lexicons (especially pronunciation lexicons) or the language models needed for natural language processing tasks such as automatic speech recognition, the language used in text corpora must be identified. In this paper, we present the design of a manually annotated corpus of mixed-language sentences, as well as the tools used to select these sentences. This corpus of difficult sentences was used to test a word-based language identification system. The system was then used to select textual data extracted from the web in order to build a lexicon and language models. The lexicon and language models were used in an automatic speech recognition system for Luxembourgish, which obtains a 25% WER on the Quaero development data.
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO]Computer Science [cs]; corpus of Luxembourgish; language identification; under-resourced language
|
|
URL: https://hal.archives-ouvertes.fr/hal-01843401
|
|
|
|
19 |
Automatic Language Identity Tagging on Word and Sentence-Level in Multilingual Text Sources: a Case-Study on Luxembourgish
|
|
|
|
In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), European Language Resources Association (ELRA), May 2014, Reykjavik, Iceland. pp.3300-3304 ; https://hal.archives-ouvertes.fr/hal-01134776 ; http://lrec2014.lrec-conf.org/en/ (2014)
|
|
|
|
20 |
A First LVCSR System for Luxembourgish, a Low-Resourced European Language
|
|
|
|
In: Zygmunt Vetulani; Joseph Mariani. Human Language Technology Challenges for Computer Science and Linguistics, 8387, Springer International Publishing, pp.479-490, 2014, 5th Language and Technology Conference, LTC 2011, Poznań, Poland, November 25-27, 2011, Revised Selected Papers, 978-3-319-08957-7. ⟨10.1007/978-3-319-08958-4_39⟩ ; https://hal.archives-ouvertes.fr/hal-01135103 (2014)
|
|
|
|
|
|