
Search in the Catalogues and Directories

Hits 1 – 20 of 33

1
Between History and Natural Language Processing: Study, Enrichment and Online Publication of French Parliamentary Debates of the Early Third Republic (1881-1899)
In: ParlaCLARIN III at LREC2022 - Workshop on Creating, Enriching and Using Parliamentary Corpora, Jun 2022, Marseille, France ; https://hal.archives-ouvertes.fr/hal-03623351 ; https://www.clarin.eu/ParlaCLARIN-III (2022)
Abstract: We present the AGODA (Analyse sémantique et Graphes relationnels pour l'Ouverture des Débats à l'Assemblée nationale) project, which aims to create a platform for consulting and exploring digitised French parliamentary debates (1881-1940) available in the digital library of the National Library of France. This project brings together historians and NLP specialists: parliamentary debates are indeed an essential source for French history of the contemporary period, but also for linguistics. This project therefore aims to produce a corpus of texts that can be easily exploited with computational methods, and that respect the TEI standard. Ancient parliamentary debates are also an excellent case study for the development and application of tools for publishing and exploring large historical corpora. In this paper, we present the steps necessary to produce such a corpus. We detail the processing and publication chain of these documents, in particular by mentioning the problems linked to the extraction of texts from digitised images. We also introduce the first analyses that we have carried out on this corpus with "bag-of-words" techniques not too sensitive to OCR quality (namely topic modelling and word embedding).
Keyword: [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-CY]Computer Science [cs]/Computers and Society [cs.CY]; [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]; [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing; [SHS.HIST]Humanities and Social Sciences/History; France; OCR; Parliamentary debates; Third Republic; Topic modelling; Word embedding; XML-TEI
URL: https://hal.archives-ouvertes.fr/hal-03623351/document
https://hal.archives-ouvertes.fr/hal-03623351
https://hal.archives-ouvertes.fr/hal-03623351/file/puren_bourgeois_pellet_vernus_agoda2022.pdf
BASE
2
ASR training dataset for Croatian ParlaSpeech-HR v1.0
Ljubešić, Nikola; Koržinek, Danijel; Rupnik, Peter. - : Jožef Stefan Institute, 2022
BASE
3
Creating and analyzing multilingual parliamentary corpora: Research Data Management Workflows Volume 1
In: https://halshs.archives-ouvertes.fr/halshs-03366486 ; 2021 (2021)
BASE
4
Building, Encoding, and Annotating a Corpus of Parliamentary Debates in XML-TEI: A Cross-Linguistic Account
In: ISSN: 2162-5603 ; EISSN: 2162-5603 ; Journal of the Text Encoding Initiative ; https://halshs.archives-ouvertes.fr/halshs-03097333 ; Journal of the Text Encoding Initiative, TEI Consortium, 2021 (2021)
BASE
5
Multilingual comparable corpora of parliamentary debates ParlaMint 2.1
BASE
6
Linguistically annotated multilingual comparable corpora of parliamentary debates ParlaMint.ana 2.1
BASE
7
Linguistically annotated multilingual comparable corpora of parliamentary debates ParlaMint.ana 2.0
BASE
8
Multilingual comparable corpora of parliamentary debates ParlaMint 2.0
BASE
9
Drivers of English Syntactic Change in the Canadian Parliament
In: Proceedings of the Society for Computation in Linguistics (2021)
BASE
10
The Emigration debate in the Dublin press of the 1820s ; Le débat sur l'émigration dans la presse dublinoise des années 1820
Mcnamara, Michelle. - : HAL CCSD, 2020
In: https://tel.archives-ouvertes.fr/tel-03506290 ; Linguistics. Université de Strasbourg, 2020. English. ⟨NNT : 2020STRAC016⟩ (2020)
BASE
11
Building, Encoding, and Annotating a Corpus of Parliamentary Debates in XML-TEI: A Cross-Linguistic Account
In: https://halshs.archives-ouvertes.fr/halshs-03097333 ; 2020 (2020)
BASE
12
Beyond the boundaries. Migration discourse in EU parliamentary debates ...
Giordano, Michela. - : University of Salento, 2020
BASE
13
Multilingual comparable corpora of parliamentary debates ParlaMint 1.0
BASE
14
Slovenian parliamentary corpus (1990-2018) siParl 2.0
Pančur, Andrej; Erjavec, Tomaž; Ojsteršek, Mihael. - : Institute of Contemporary History, 2020
BASE
15
AustroParl Corpus of Parliamentary Debates ...
BASE
16
AustroParl Corpus of Parliamentary Debates ...
BASE
17
Beyond the boundaries. Migration discourse in EU parliamentary debates
In: Lingue e Linguaggi; Volume 39 (2020); 131-156 (2020)
BASE
18
Slovenian parliamentary corpus siParl 1.0 (1990-2018)
Pančur, Andrej; Erjavec, Tomaž; Ojsteršek, Mihael. - : Institute of Contemporary History, 2019
BASE
19
Slovenian parliamentary corpus ParlaMeter-sl 1.0
Dobranić, Filip; Ljubešić, Nikola; Erjavec, Tomaž. - : Jožef Stefan Institute, 2019
BASE
20
Croatian parliamentary corpus ParlaMeter-hr 1.0
Dobranić, Filip; Ljubešić, Nikola; Erjavec, Tomaž. - : Jožef Stefan Institute, 2019
BASE

