1 |
The ParlaMint corpora of parliamentary proceedings
|
|
Erjavec, Tomaž; Ogrodniczuk, Maciej; Osenova, Petya; Ljubešić, Nikola; Simov, Kiril; Pančur, Andrej; Rudolf, Michał; Kopp, Matyáš; Barkarson, Starkaður; Steingrímsson, Steinþór; Çöltekin, Çağrı; de Does, Jesse; Depuydt, Katrien; Agnoloni, Tommaso; Venturi, Giulia; Pérez, María Calzada; de Macedo, Luciana D.; Navarretta, Costanza; Luxardo, Giancarlo; Coole, Matthew; Rayson, Paul; Morkevičius, Vaidas; Krilavičius, Tomas; Darģis, Roberts; Ring, Orsolya; van Heusden, Ruben; Marx, Maarten; Fišer, Darja. - 2022
|
|
Abstract:
This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17 European national parliaments with half a billion words. The corpora are uniformly encoded, contain rich meta-data about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project’s GitHub repository, and the complete corpora are openly available via the CLARIN.SI repository for download, as well as through the NoSketch Engine and KonText concordancers and the Parlameter interface for on-line exploration and analysis.
|
|
URL: https://doi.org/10.1007/s10579-021-09574-0 https://eprints.lancs.ac.uk/id/eprint/165473/1/s10579_021_09574_0.pdf https://eprints.lancs.ac.uk/id/eprint/165473/
|
|
BASE
|
2 |
The ParlaMint corpora of parliamentary proceedings
|
|
|
|
In: Lang Resour Eval (2022)
|
|
BASE
|
6 |
Multilingual comparable corpora of parliamentary debates ParlaMint 2.1
|
|
|
|
BASE
|
7 |
Linguistically annotated multilingual comparable corpora of parliamentary debates ParlaMint.ana 2.1
|
|
|
|
BASE
|
8 |
Linguistically annotated multilingual comparable corpora of parliamentary debates ParlaMint.ana 2.0
|
|
|
|
BASE
|
9 |
Multilingual comparable corpora of parliamentary debates ParlaMint 2.0
|
|
|
|
BASE
|
10 |
Feature-Rich Named Entity Recognition for Bulgarian Using Conditional Random Fields ...
|
|
|
|
BASE
|
11 |
Formae reformandae: for a reorganisation of verb form annotation in Universal Dependencies illustrated by the specific case of Latin
|
|
Cecchini, Flavio Massimiliano (orcid:0000-0001-9029-1822). - Sofia, Bulgaria : Association for Computational Linguistics, 2021
|
|
BASE
|
16 |
Multilingual comparable corpora of parliamentary debates ParlaMint 1.0
|
|
|
|
BASE
|
17 |
The CLASSLA-StanfordNLP model for UD dependency parsing of standard Bulgarian 1.0
|
|
|
|
BASE
|
18 |
The CLASSLA-StanfordNLP model for morphosyntactic annotation of standard Bulgarian 1.0
|
|
|
|
BASE
|
19 |
The CLASSLA-StanfordNLP model for lemmatisation of standard Bulgarian 1.1
|
|
|
|
BASE
|
20 |
The CLASSLA-StanfordNLP model for lemmatisation of standard Bulgarian 1.0
|
|
|
|
BASE