
Search in the Catalogues and Directories

Hits 1 – 4 of 4

1
Collecting and annotating corpora for three under-resourced languages of France: Methodological issues
In: ISSN: 1934-5275 ; EISSN: 1934-5275 ; Language Documentation & Conservation ; https://hal.archives-ouvertes.fr/hal-03273196 ; Language Documentation & Conservation, University of Hawaiʻi Press, 2021, 15, pp. 316-357 ; http://hdl.handle.net/10125/74645 (2021)
BASE
2
Collecting and annotating corpora for three under-resourced languages of France: Methodological issues
Bernhard, Delphine; Ligozat, Anne-Laure; Bras, Myriam. University of Hawaii Press, 2021
BASE
3
Collecting and annotating corpora for three under-resourced languages of France: Methodological issues
Abstract: In contrast to French, the vast majority of regional languages of France can be considered as under-resourced. In this article, we present the results of a research project aiming to produce annotated resources for three regional languages of France: Alsatian, Occitan, and Picard. These languages cover three different language families (Germanic and two subfamilies of Romance, Oïl and Oc languages) and different sociolinguistic situations. Yet, they all face issues common to many under-resourced languages: lack of human and financial resources and presence of geolinguistic variation. The originality of this project is that it brought together researchers from different fields (sociolinguistics, descriptive linguistics, dialectology, natural language processing, digital humanities) to work together towards the common goal of developing annotated corpora for Alsatian, Occitan, and Picard. This created a favorable and stimulating working environment which could not have been achieved had different research groups worked independently, each on a single language. This article details the annotation process, with a special focus on the delimitation of the tokens and the definition of the part-of-speech tags. ; National Foreign Language Resource Center
Keyword: Alsatian; annotations; corpus; Occitan; part-of-speech; Picard; tokenization
URL: http://hdl.handle.net/10125/74645
BASE
4
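L’avenir numérique des langues minoritaires : bilan du projet RESTAURE pour l’alsacien, l’occitan et le picard [The digital future of minority languages: results of the RESTAURE project for Alsatian, Occitan, and Picard]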
In: ISSN: 2105-0368 ; Les Cahiers du GEPE ; Colloque « Langues minoritaires » : quels acteurs pour quel avenir ? ; https://hal.archives-ouvertes.fr/hal-02378172 ; Les Cahiers du GEPE, Université de Strasbourg, 2020, Langues minoritaires : Quels acteurs pour quel avenir ? ; http://cahiersdugepe.fr/index.php?id=3662 (2020)
BASE
