Search in the Catalogues and Directories

Hits 1 – 20 of 310

1
Between History and Natural Language Processing: Study, Enrichment and Online Publication of French Parliamentary Debates of the Early Third Republic (1881-1899)
In: ParlaCLARIN III at LREC2022 - Workshop on Creating, Enriching and Using Parliamentary Corpora ; https://hal.archives-ouvertes.fr/hal-03623351 ; ParlaCLARIN III at LREC2022 - Workshop on Creating, Enriching and Using Parliamentary Corpora, Jun 2022, Marseille, France ; https://www.clarin.eu/ParlaCLARIN-III (2022)
Abstract: We present the AGODA (Analyse sémantique et Graphes relationnels pour l'Ouverture des Débats à l'Assemblée nationale) project, which aims to create a platform for consulting and exploring digitised French parliamentary debates (1881-1940) available in the digital library of the National Library of France. This project brings together historians and NLP specialists: parliamentary debates are indeed an essential source for French history of the contemporary period, but also for linguistics. This project therefore aims to produce a corpus of texts that can be easily exploited with computational methods, and that respect the TEI standard. Ancient parliamentary debates are also an excellent case study for the development and application of tools for publishing and exploring large historical corpora. In this paper, we present the steps necessary to produce such a corpus. We detail the processing and publication chain of these documents, in particular by mentioning the problems linked to the extraction of texts from digitised images. We also introduce the first analyses that we have carried out on this corpus with "bag-of-words" techniques not too sensitive to OCR quality (namely topic modelling and word embedding).
Keyword: [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-CY]Computer Science [cs]/Computers and Society [cs.CY]; [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]; [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing; [SHS.HIST]Humanities and Social Sciences/History; France; OCR; Parliamentary debates; Third Republic; Topic modelling; Word embedding; XML-TEI
URL: https://hal.archives-ouvertes.fr/hal-03623351/document
https://hal.archives-ouvertes.fr/hal-03623351
https://hal.archives-ouvertes.fr/hal-03623351/file/puren_bourgeois_pellet_vernus_agoda2022.pdf
BASE
2
Considerations for Multilingual Wikipedia Research ...
Johnson, Isaac; Lescak, Emily. - : arXiv, 2022
BASE
3
Cross-Lingual Query-Based Summarization of Crisis-Related Social Media: An Abstractive Approach Using Transformers ...
Vitiugin, Fedor; Castillo, Carlos. - : arXiv, 2022
BASE
4
MMTAfrica: Multilingual Machine Translation for African Languages ...
BASE
5
A New Generation of Perspective API: Efficient Multilingual Character-level Transformers ...
Lees, Alyssa; Tran, Vinh Q.; Tay, Yi. - : arXiv, 2022
BASE
6
MuMiN: A Large-Scale Multilingual Multimodal Fact-Checked Misinformation Social Network Dataset ...
BASE
7
Korean Online Hate Speech Dataset for Multilabel Classification: How Can Social Science Improve Dataset on Hate Speech? ...
BASE
8
An NLP Solution to Foster the Use of Information in Electronic Health Records for Efficiency in Decision-Making in Hospital Care ...
BASE
9
Networks and Identity Drive Geographic Properties of the Diffusion of Linguistic Innovation ...
BASE
10
Using Pre-Trained Language Models for Producing Counter Narratives Against Hate Speech: a Comparative Study ...
BASE
11
Cyberbullying Classifiers are Sensitive to Model-Agnostic Perturbations ...
BASE
12
Achieving Downstream Fairness with Geometric Repair ...
BASE
13
Towards Responsible Natural Language Annotation for the Varieties of Arabic ...
Bergman, A. Stevie; Diab, Mona T.. - : arXiv, 2022
BASE
14
Polling Latent Opinions: A Method for Computational Sociolinguistics Using Transformer Language Models ...
BASE
15
Who will share Fake-News on Twitter? Psycholinguistic cues in online post histories discriminate Between actors in the misinformation ecosystem ...
BASE
16
A Psycho-linguistic Analysis of BitChute ...
Horne, Benjamin D.. - : arXiv, 2022
BASE
17
How Hermeneutic Spirals may reduce Complexity to Narrative Schemata - expanding on "Complexity and the Userly Text"
In: https://hal.archives-ouvertes.fr/hal-03254233 ; 2021 (2021)
BASE
18
Digital participation of left-wing activists in Brazil: cultural events as a cement to mobilization and networked protest
In: Brasiliana: Journal for Brazilian Studies ; https://hal.archives-ouvertes.fr/hal-03365831 ; Brasiliana: Journal for Brazilian Studies, 2021, 10 (1), pp.261-284. ⟨10.25160/bjbs.v10i1.125719⟩ (2021)
BASE
19
Influencer detection in social media ; Détection des influenceurs dans des médias sociaux
Deturck, Kévin. - : HAL CCSD, 2021
In: https://tel.archives-ouvertes.fr/tel-03640442 ; Ordinateur et société [cs.CY]. Institut National des Langues et Civilisations Orientales- INALCO PARIS - LANGUES O', 2021. Français. ⟨NNT : 2021INAL0034⟩ (2021)
BASE
20
L’intelligence artificielle au risque du singulier ; L’intelligence artificielle au risque du singulier: Les limites du calcul des significations dans les technologies de la traduction
In: Qu'est-ce qui échappe à l'intelligence artificielle ? Colloque interdisciplinaire ; https://hal-utt.archives-ouvertes.fr/hal-03358046 ; Qu'est-ce qui échappe à l'intelligence artificielle ? Colloque interdisciplinaire, Laboratoire de sciences humaines de Polytechnique, Sep 2021, Palaiseau, France (2021)
BASE

Catalogues: 0
Bibliographies: 0
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 310