1 |
RETRIEVING SPEAKER INFORMATION FROM PERSONALIZED ACOUSTIC MODELS FOR SPEECH RECOGNITION
|
|
|
|
In: IEEE ICASSP 2022 ; https://hal.archives-ouvertes.fr/hal-03539741 ; IEEE ICASSP 2022, 2022, Singapore, Singapore (2022)
|
|
BASE
|
2 |
From FreEM to D'AlemBERT: a Large Corpus and a Language Model for Early Modern French
|
|
|
|
In: Proceedings of the 13th Language Resources and Evaluation Conference ; https://hal.inria.fr/hal-03596653 ; Proceedings of the 13th Language Resources and Evaluation Conference, European Language Resources Association, Jun 2022, Marseille, France (2022)
|
|
Abstract:
8 pages, 2 figures, 4 tables ; International audience ; Language models for historical states of language are becoming increasingly important to allow the optimal digitisation and analysis of old textual sources. Because these historical states are at the same time more complex to process and more scarce in the corpora available, specific efforts are necessary to train natural language processing (NLP) tools adapted to the data. In this paper, we present our efforts to develop NLP tools for Early Modern French (historical French from the 16th to the 18th centuries). We present the FreEMmax corpus of Early Modern French and D'AlemBERT, a RoBERTa-based language model trained on FreEMmax. We evaluate the usefulness of D'AlemBERT by fine-tuning it on a part-of-speech tagging task, outperforming previous work on the test set. Importantly, we find evidence for the transfer learning capacity of the language model, since its performance on lesser-resourced time periods appears to have been boosted by the more resourced ones. We release D'AlemBERT and the open-sourced subpart of the FreEMmax corpus.
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; Corpus creation; Création de corpus; Digital humanities; Early Modern French; Français classique; Humanités Numériques; Language modelling; Langues peu dotées; Less-resourced languages; Modèle de langue neuronal; Modélisation linguistique; Neural language representation models; Partie du discours; POS tagging
|
|
URL: https://hal.inria.fr/hal-03596653
|
|
BASE
|
3 |
A gentle introduction to Girard's Transcendental Syntax for the linear logician
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-02977750 ; 2022 (2022)
|
|
BASE
|
4 |
Learning and controlling the source-filter representation of speech with a variational autoencoder
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03650569 ; 2022 (2022)
|
|
BASE
|
5 |
Hippocampal ensembles represent sequential relationships among an extended sequence of nonspatial events.
|
|
|
|
In: Nature communications, vol 13, iss 1 (2022)
|
|
BASE
|
6 |
Changes in the midst of a construction network: a diachronic construction grammar approach to complex prepositions denoting internal location
|
|
|
|
In: ISSN: 0936-5907 ; EISSN: 1613-3641 ; Cognitive Linguistics ; https://halshs.archives-ouvertes.fr/halshs-03637056 ; Cognitive Linguistics, De Gruyter, 2022, ⟨10.1515/cog-2021-0128⟩ (2022)
|
|
BASE
|
7 |
Changes in the midst of a construction network: a diachronic construction grammar approach to complex prepositions denoting internal location
|
|
|
|
In: ISSN: 0936-5907 ; EISSN: 1613-3641 ; Cognitive Linguistics ; https://halshs.archives-ouvertes.fr/halshs-03637056 ; Cognitive Linguistics, De Gruyter, In press, ⟨10.1515/cog-2021-0128⟩ (2022)
|
|
BASE
|
8 |
Le modèle Transformer: un « couteau suisse » pour le traitement automatique des langues
|
|
|
|
In: Techniques de l'Ingenieur ; https://hal.archives-ouvertes.fr/hal-03619077 ; Techniques de l'Ingenieur, Techniques de l'ingénieur, 2022, ⟨10.51257/a-v1-in195⟩ ; https://www.techniques-ingenieur.fr/base-documentaire/innovation-th10/innovations-en-electronique-et-tic-42257210/transformer-des-reseaux-de-neurones-pour-le-traitement-automatique-des-langues-in195/ (2022)
|
|
BASE
|
9 |
Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost
|
|
|
|
In: ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-03613101 ; ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, May 2022, Dublin, Ireland (2022)
|
|
BASE
|
10 |
Imputing out-of-vocabulary embeddings with LOVE makes language models robust with little cost
|
|
|
|
In: ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-03613101 ; ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, May 2022, Dublin, Ireland (2022)
|
|
BASE
|
11 |
Structured, flexible, and robust: comparing linguistic plans and explanations generated by humans and large language models ...
|
|
|
|
BASE
|
12 |
From bag-of-words towards natural language: adapting topic models to avoid stop word removal ...
|
|
|
|
BASE
|
13 |
A Collection of Classroom Instruction ...
|
|
|
|
BASE
|
14 |
Biodiversity: how big is our global biodiversity debt and what can we do about it? ...
|
|
|
|
BASE
|
17 |
Bayesian data analysis in the phonetic sciences: A tutorial introduction ...
|
|
|
|
BASE
|
18 |
How Cognitive Abilities May Support Children’s Bilingual Literacy Development in a Multilingual Society ...
|
|
|
|
BASE
|
19 |
On the Transferability of Pre-trained Language Models for Low-Resource Programming Languages ...
|
|
Chen, Fuxiang. - : Federated Research Data Repository / dépôt fédéré de données de recherche, 2022
|
|
BASE