3
Differential Evaluation: a Qualitative Analysis of Natural Language Processing System Behavior Based Upon Data Resistance to Processing
In: Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems ; EVAL4NLP, 2nd Workshop on "Evaluation & Comparison of NLP Systems", EMNLP 2021 ; https://hal.archives-ouvertes.fr/hal-03432331 ; EVAL4NLP, 2nd Workshop on "Evaluation & Comparison of NLP Systems", EMNLP 2021, Nov 2021, Punta Cana, Dominican Republic (2021)
BASE

5
DiaBLa: A Corpus of Bilingual Spontaneous Written Dialogues for Machine Translation
In: ISSN: 1574-020X ; EISSN: 1574-0218 ; Language Resources and Evaluation ; https://hal.inria.fr/hal-03021633 ; Language Resources and Evaluation, Springer Verlag, 2020, ⟨10.1007/s10579-020-09514-4⟩ (2020)

6
Automatic Removal of Identifying Information in Official EU Languages for Public Administrations: The MAPA Project
In: Legal Knowledge and Information Systems ; Frontiers in Artificial Intelligence and Applications ; International Conference on Legal Knowledge and Information Systems ; https://hal.archives-ouvertes.fr/hal-03058311 ; International Conference on Legal Knowledge and Information Systems, Dec 2020, Brno, Prague, Czech Republic. pp.223-226, ⟨10.3233/FAIA200869⟩ ; http://ebooks.iospress.nl/volume/legal-knowledge-and-information-systems-jurix-2020-the-thirty-third-annual-conference-brno-czech-republic-december-911-2020 (2020)

7
CharacterBERT: Reconciling ELMo and BERT for Word-Level Open-Vocabulary Representations From Characters
In: International Conference on Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-03100665 ; International Conference on Computational Linguistics, Dec 2020, Barcelona (on line), Spain. pp.6903-6915 ; https://coling2020.org/ (2020)

8
Embedding Strategies for Specialized Domains: Application to Clinical Entity Recognition
In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop ; https://hal.archives-ouvertes.fr/hal-02860947 ; Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, Jul 2019, Florence, France. pp.295-301, ⟨10.18653/v1/P19-2041⟩ (2019)

9
Detecting context-dependent sentences in parallel corpora ; Détection dans des corpus parallèles de phrases dépendantes du contexte
In: 25th Conférence sur le Traitement Automatique des Langues Naturelles ; https://hal.archives-ouvertes.fr/hal-01800736 ; 25th Conférence sur le Traitement Automatique des Langues Naturelles, 2018, Rennes, France. pp.393-400 (2018)

10
CLEF eHealth 2017 Multilingual Information Extraction task Overview: ICD10 Coding of Death Certificates in English and French.
In: Workshop of the Cross-Language Evaluation Forum ; https://hal.archives-ouvertes.fr/hal-01665374 ; Workshop of the Cross-Language Evaluation Forum, CEUR-WS, Jan 2017, Dublin, Ireland (2017)

11
LIMSI@WMT16: Machine Translation of News
In: First Conference on Machine Translation ; https://hal.archives-ouvertes.fr/hal-01388659 ; First Conference on Machine Translation, Aug 2016, Berlin, Germany. pp.239-245, ⟨10.18653/v1/W16-2304⟩ ; https://statmt.org/wmt16 (2016)

12
Two-Step MT: Predicting Target Morphology
In: International Workshop on Spoken Language Translation ; https://hal.archives-ouvertes.fr/hal-01592337 ; International Workshop on Spoken Language Translation, 2016, Seattle, WA, United States (2016)

16
Oublier ce qu'on sait, pour mieux apprendre ce qu'on ne sait pas : une étude sur les contraintes de type dans les modèles CRF
In: Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs ; Conférence sur le Traitement Automatique des Langues Naturelles ; https://hal.archives-ouvertes.fr/hal-01634997 ; Conférence sur le Traitement Automatique des Langues Naturelles, ATALA, 2015, Caen, France (2015)

17
LIMSI@WMT'15 : Translation Task
In: Proceedings of the Tenth Workshop on Statistical Machine Translation ; https://hal.archives-ouvertes.fr/hal-02912383 ; Proceedings of the Tenth Workshop on Statistical Machine Translation, Sep 2015, Lisbon, Portugal. pp.145-151, ⟨10.18653/v1/W15-3016⟩ (2015)

18
Automatic language identity tagging on word and sentence-level in multilingual text sources: a case-study on Luxembourgish
In: International Conference on Language Resources and Evaluation ; https://hal.archives-ouvertes.fr/hal-01843401 ; International Conference on Language Resources and Evaluation, May 2014, Reykjavik, Iceland (2014)

19
Automatic Language Identity Tagging on Word and Sentence-Level in Multilingual Text Sources: a Case-Study on Luxembourgish
In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14) ; Ninth International Conference on Language Resources and Evaluation (LREC'14) ; https://hal.archives-ouvertes.fr/hal-01134776 ; Ninth International Conference on Language Resources and Evaluation (LREC'14), European Language Resources Association (ELRA), May 2014, Reykjavik, Iceland. pp.3300-3304 ; http://lrec2014.lrec-conf.org/en/ (2014)

20
A First LVCSR System for Luxembourgish, a Low-Resourced European Language
In: Human Language Technology Challenges for Computer Science and Linguistics ; https://hal.archives-ouvertes.fr/hal-01135103 ; Zygmunt Vetulani; Joseph Mariani. Human Language Technology Challenges for Computer Science and Linguistics, 8387, Springer International Publishing, pp.479-490, 2014, 5th Language and Technology Conference, LTC 2011, Poznań, Poland, November 25-27, 2011, Revised Selected Papers, 978-3-319-08957-7. ⟨10.1007/978-3-319-08958-4_39⟩ (2014)
Abstract:
Luxembourgish is embedded in a multilingual context on the divide between Romance and Germanic cultures and remains one of Europe’s low-resourced languages. We describe our efforts in building a large vocabulary ASR system for such a “minority” language without resorting to any prior transcribed audio training data. Instead, acoustic models are derived from major European languages. Furthermore, most Luxembourgish written sources include significant parts in other languages. This poses specific challenges to Language Model estimation. Some scientific and technological issues addressed include: (i) how to build acoustic models if no labeled acoustic training data are available for the under-resourced target language? (ii) how to make use of the new system to accelerate resource production for the target language? (iii) how to build a vocabulary and a language model with multilingual written texts? (iv) how to determine the “best” phonemic inventory for ASR? First ASR results illustrate the accuracy of the various sets of monolingual and multilingual acoustic models and what these suggest concerning language typology issues.
Keywords:
[INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL]; [SHS.LANGUE] Humanities and Social Sciences/Linguistics; Acoustic modeling; Forced alignment; Germanic languages; Luxembourgish; Multilingual models; Romance languages
URL: https://doi.org/10.1007/978-3-319-08958-4_39 ; https://hal.archives-ouvertes.fr/hal-01135103