2 |
Automatic Normalisation of Early Modern French
|
|
|
|
In: https://hal.inria.fr/hal-03540226 ; 2022 (2022)
|
|
BASE
3 |
[Rezension zu:] Grund und Grenze des Verstehens : Theologie und Hermeneutik im Anschluss an Friedrich Schleiermacher / Florian Priesemuth. - Berlin: De Gruyter, 2020 (Schleiermacher-Archiv ; 32)
|
|
|
|
BASE
4 |
Aus der Editionswerkstatt: Schleiermachers Praktische Theologie – Frerichs’ Ausgabe in ihre Quellen zerlegt
|
|
|
|
BASE
5 |
From FreEM to D'AlemBERT: a Large Corpus and a Language Model for Early Modern French
|
|
|
|
In: Proceedings of the 13th Language Resources and Evaluation Conference, European Language Resources Association, Jun 2022, Marseille, France ; https://hal.inria.fr/hal-03596653 (2022)
|
|
Abstract:
8 pages, 2 figures, 4 tables ; International audience ; Language models for historical states of language are becoming increasingly important to allow the optimal digitisation and analysis of old textual sources. Because these historical states are at the same time more complex to process and more scarce in the corpora available, specific efforts are necessary to train natural language processing (NLP) tools adapted to the data. In this paper, we present our efforts to develop NLP tools for Early Modern French (historical French from the 16th to the 18th centuries). We present the FreEMmax corpus of Early Modern French and D'AlemBERT, a RoBERTa-based language model trained on FreEMmax. We evaluate the usefulness of D'AlemBERT by fine-tuning it on a part-of-speech tagging task, outperforming previous work on the test set. Importantly, we find evidence for the transfer learning capacity of the language model, since its performance on lesser-resourced time periods appears to have been boosted by the more resourced ones. We release D'AlemBERT and the open-sourced subpart of the FreEMmax corpus.
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; Corpus creation; Création de corpus; Digital humanities; Early Modern French; Français classique; Humanités Numériques; Language modelling; Langues peu dotées; Less-resourced languages; Modèle de langue neuronal; Modélisation linguistique; Neural language representation models; Partie du discours; POS tagging
|
|
URL: https://hal.inria.fr/hal-03596653
|
|
BASE
6 |
Etude de cas de pathologies de la parole dans le cadre de la prise en charge orthophonique
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03568182 ; 2022 (2022)
|
|
BASE
7 |
The sociative/benefactive applicative construction and the introduction of attitude holders in Tibetan
|
|
|
|
In: Applicative Morphology: Neglected Syntactic and Non-Syntactic Functions, In press ; https://hal.archives-ouvertes.fr/hal-03610908 (2022)
|
|
BASE
8 |
Learning and controlling the source-filter representation of speech with a variational autoencoder
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03650569 ; 2022 (2022)
|
|
BASE
9 |
Introducing the HIPE 2022 Shared Task: Named Entity Recognition and Linking in Multilingual Historical Documents
|
|
|
|
In: Matthias Hagen; Suzan Verberne; Craig Macdonald; Christin Seifert; Krisztian Balog; Kjetil Nørvåg; Vinay Setty. Advances in Information Retrieval. 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part II, 13186, Springer International Publishing, pp. 347-354, 2022, Lecture Notes in Computer Science, 978-3-030-99738-0. ⟨10.1007/978-3-030-99739-7_44⟩ ; https://hal.archives-ouvertes.fr/hal-03635971 (2022)
|
|
BASE
10 |
Le tibétain parlé : exercices pratiques (vol. 1)
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03610669 ; Forthcoming (2022)
|
|
BASE
11 |
The grammaticalization of plurality in the languages of Amdo
|
|
|
|
In: Himalayan Linguistics, 2022, 20 (3), ⟨10.5070/H920353650⟩ ; https://hal.archives-ouvertes.fr/hal-03610576 (2022)
|
|
BASE
12 |
Replication data and results for the publication "The dirty work of boundary maintenance" ...
|
|
|
|
BASE
13 |
Grenzüberschreitendes Textmining von Historischen Zeitungen - Das impresso-Projekt zwischen Text- und Bildverarbeitung, Design und Geschichtswissenschaft ...
|
|
|
|
BASE
14 |
Grenzüberschreitendes Textmining von Historischen Zeitungen - Das impresso-Projekt zwischen Text- und Bildverarbeitung, Design und Geschichtswissenschaft ...
|
|
|
|
BASE
15 |
Consent to Research in Madagascar: Challenges, Strategies, and Priorities for Future Research
|
|
|
|
BASE
16 |
Utilising a systematic review-based approach to create a database of individual participant data for meta- and network meta-analyses: The RELEASE database of aphasia after stroke
|
|
|
|
In: Research outputs 2014 to 2021 (2022)
|
|
BASE