
Search in the Catalogues and Directories

Hits 21 – 40 of 565

21
Coreference in Universal Dependencies 0.1 (CorefUD 0.1)
Nedoluzhko, Anna; Novák, Michal; Popel, Martin; Žabokrtský, Zdeněk; Zeman, Daniel. - : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2021
Abstract: CorefUD is a collection of previously existing datasets annotated with coreference, which we converted into a common annotation scheme. In total, CorefUD in its current version 0.1 consists of 17 datasets for 11 languages. The datasets are enriched with automatic morphological and syntactic annotations that are fully compliant with the standards of the Universal Dependencies project. All the datasets are stored in the CoNLL-U format, with coreference- and bridging-specific information captured by attribute-value pairs located in the MISC column.
The collection is divided into a public edition and a non-public (ÚFAL-internal) edition. The publicly available edition is distributed via LINDAT-CLARIAH-CZ and contains 13 datasets for 10 languages (1 dataset for Catalan, 2 for Czech, 2 for English, 1 for French, 2 for German, 1 for Hungarian, 1 for Lithuanian, 1 for Polish, 1 for Russian, and 1 for Spanish), excluding the test data. The non-public edition is available internally to ÚFAL members and contains 4 additional datasets for 2 languages (1 dataset for Dutch, and 3 for English), which we are not allowed to distribute due to their original license limitations. It also contains the test data portions for all datasets.
When using any of the harmonized datasets, please get acquainted with its license (placed in the same directory as the data) and cite the original data resource too.
References to original resources whose harmonized versions are contained in the public edition of CorefUD 0.1:
- Catalan-AnCora: Recasens, M. and Martí, M. A. (2010). AnCora-CO: Coreferentially Annotated Corpora for Spanish and Catalan. Language Resources and Evaluation, 44(4):315–345.
- Czech-PCEDT: Nedoluzhko, A., Novák, M., Cinková, S., Mikulová, M., and Mírovský, J. (2016). Coreference in Prague Czech-English Dependency Treebank. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 169–176, Portorož, Slovenia. European Language Resources Association.
- Czech-PDT: Hajič, J., Bejček, E., Hlaváčová, J., Mikulová, M., Straka, M., Štěpánek, J., and Štěpánková, B. (2020). Prague Dependency Treebank - Consolidated 1.0. In Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020), pages 5208–5218, Marseille, France. European Language Resources Association.
- English-GUM: Zeldes, A. (2017). The GUM Corpus: Creating Multilayer Resources in the Classroom. Language Resources and Evaluation, 51(3):581–612.
- English-ParCorFull: Lapshinova-Koltunski, E., Hardmeier, C., and Krielke, P. (2018). ParCorFull: a Parallel Corpus Annotated with Full Coreference. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association.
- French-Democrat: Landragin, F. (2016). Description, modélisation et détection automatique des chaînes de référence (DEMOCRAT). Bulletin de l’Association Française pour l’Intelligence Artificielle, (92):11–15.
- German-ParCorFull: Lapshinova-Koltunski, E., Hardmeier, C., and Krielke, P. (2018). ParCorFull: a Parallel Corpus Annotated with Full Coreference. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association.
- German-PotsdamCC: Bourgonje, P. and Stede, M. (2020). The Potsdam Commentary Corpus 2.2: Extending annotations for shallow discourse parsing. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 1061–1066, Marseille, France. European Language Resources Association.
- Hungarian-SzegedKoref: Vincze, V., Hegedűs, K., Sliz-Nagy, A., and Farkas, R. (2018). SzegedKoref: A Hungarian Coreference Corpus. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association.
- Lithuanian-LCC: Žitkus, V. and Butkienė, R. (2018). Coreference Annotation Scheme and Corpus for Lithuanian Language. In Fifth International Conference on Social Networks Analysis, Management and Security, SNAMS 2018, Valencia, Spain, October 15-18, 2018, pages 243–250. IEEE.
- Polish-PCC: Ogrodniczuk, M., Glowińska, K., Kopeć, M., Savary, A., and Zawisławska, M. (2013). Polish coreference corpus. In Human Language Technology. Challenges for Computer Science and Linguistics - 6th Language and Technology Conference, LTC 2013, Poznań, Poland, December 7-9, 2013. Revised Selected Papers, volume 9561 of Lecture Notes in Computer Science, pages 215–226. Springer.
- Russian-RuCor: Toldova, S., Roytberg, A., Ladygina, A. A., Vasilyeva, M. D., Azerkovich, I. L., Kurzukov, M., Sim, G., Gorshkov, D. V., Ivanova, A., Nedoluzhko, A., and Grishina, Y. (2014). Evaluating Anaphora and Coreference Resolution for Russian. In Komp’juternaja lingvistika i intellektual’nye tehnologii. Po materialam ezhegodnoj Mezhdunarodnoj konferencii Dialog, pages 681–695.
- Spanish-AnCora: Recasens, M. and Martí, M. A. (2010). AnCora-CO: Coreferentially Annotated Corpora for Spanish and Catalan. Language Resources and Evaluation, 44(4):315–345.
References to original resources whose harmonized versions are contained in the ÚFAL-internal edition of CorefUD 0.1:
- Dutch-COREA: Hendrickx, I., Bouma, G., Coppens, F., Daelemans, W., Hoste, V., Kloosterman, G., Mineur, A.-M., Van Der Vloet, J., and Verschelde, J.-L. (2008). A coreference corpus and resolution system for Dutch. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08), Marrakech, Morocco. European Language Resources Association.
- English-ARRAU: Uryupina, O., Artstein, R., Bristot, A., Cavicchio, F., Delogu, F., Rodriguez, K. J., and Poesio, M. (2020). Annotating a broad range of anaphoric phenomena, in a variety of genres: the ARRAU Corpus. Natural Language Engineering, 26(1):95–128.
- English-OntoNotes: Weischedel, R., Hovy, E., Marcus, M., Palmer, M., Belvin, R., Pradhan, S., Ramshaw, L., and Xue, N. (2011). Ontonotes: A large training corpus for enhanced processing. In Handbook of Natural Language Processing and Machine Translation: DARPA Global Autonomous Language Exploitation, pages 54–63, New York. Springer-Verlag.
- English-PCEDT: Nedoluzhko, A., Novák, M., Cinková, S., Mikulová, M., and Mírovský, J. (2016). Coreference in Prague Czech-English Dependency Treebank. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), pages 169–176, Portorož, Slovenia. European Language Resources Association.
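The abstract above notes that coreference information is stored as attribute-value pairs in the MISC column of CoNLL-U files. As a rough illustration only, the following Python sketch reads the MISC field (the tenth tab-separated column) of a single token line into a dictionary. The attribute name `ExampleCluster` and the sample line are hypothetical placeholders, not the actual CorefUD 0.1 attribute names, which should be taken from the dataset documentation.

```python
def parse_misc(misc):
    """Turn a CoNLL-U MISC field into a dict of attribute-value pairs.

    MISC items are separated by '|'; an underscore marks an empty field.
    """
    if misc == "_":
        return {}
    pairs = {}
    for item in misc.split("|"):
        # Split on the first '=' only; an item without '=' gets an empty value.
        key, sep, value = item.partition("=")
        pairs[key] = value if sep else ""
    return pairs


def misc_of_token_line(line):
    """Return the parsed MISC attributes of one CoNLL-U token line."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 10:
        raise ValueError("a CoNLL-U token line has exactly 10 columns")
    return parse_misc(columns[9])


# Hypothetical token line; 'ExampleCluster' is an illustrative attribute name,
# not one defined by CorefUD.
example = "1\tShe\tshe\tPRON\tPRP\t_\t2\tnsubj\t_\tExampleCluster=c1|SpaceAfter=No"
attrs = misc_of_token_line(example)
print(attrs)
```

Comment lines (starting with `#`) and blank sentence separators in a real file would need to be skipped before calling `misc_of_token_line`.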
Keyword: bridging relations; coreference; dependency; harmonized annotation; treebank
URL: http://hdl.handle.net/11234/1-3510
BASE
22
Annotated Corpus of Pre-Standardized Balkan Slavic Literature 1.1
Šimko, Ivan. - : Slavic Seminary, University of Zurich, 2021
BASE
23
Training corpus ssj500k 2.3
Krek, Simon; Dobrovoljc, Kaja; Erjavec, Tomaž. - : Centre for Language Resources and Technologies, University of Ljubljana, 2021
BASE
24
Old Catalan Morphosyntax: Developing an Annotated Corpus
In: Journal of Open Humanities Data; Vol 7 (2021); 30 ; 2059-481X (2021)
BASE
25
IT-TB_PML_analytical-tectogrammatical
Passarotti, Marco; Testori, Marinella; González Saavedra, Berta. - : CIRCSE Research Centre, Università Cattolica del Sacro Cuore, 2021
BASE
26
More Data and New Tools. Advances in Parsing the Index Thomisticus Treebank ...
BASE
27
More Data and New Tools. Advances in Parsing the Index Thomisticus Treebank ...
BASE
28
Discourse Relations and Connectives in Higher Text Structure
In: Dialogue & Discourse; Vol 12 No 2 (2021); 1–37 ; 2152-9620 (2021)
BASE
29
Overview of AMALGUM – Large Silver Quality Annotations across English Genres
In: Proceedings of the Society for Computation in Linguistics (2021)
BASE
30
ODIL Syntax : a Free Spontaneous Spoken French Treebank Annotated with Constituent Trees
In: Language Resources and Evaluation Conference, LREC ; https://hal.archives-ouvertes.fr/hal-02523141 ; Language Resources and Evaluation Conference, LREC, May 2020, Marseille, France (2020)
BASE
31
Building a Universal Dependencies Treebank for Occitan
In: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020) ; 12th Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-02892715 ; 12th Language Resources and Evaluation Conference, May 2020, Marseille, France. pp.2932-2939 (2020)
BASE
32
Les constructions à verbe "εἶναι" 'être' et participe présent: 'status quaestionis' et nouvelles propositions
In: Société de linguistique de Paris. Bulletin de la Société de Linguistique de Paris. - Paris ; Louvain : Peeters 115 (2020) 1, 191-239
BLLDB
33
Christian Fandrych (Hrsg.): Gesprochene Wissenschaftssprache. Tübingen: Stauffenburg Verlag, 2017
In: Informationen Deutsch als Fremdsprache. - Berlin : De Gruyter 47 (2020) 2-3, 205-208
BLLDB
34
Het transcriptieprotocol van het Gesproken Corpus van de Nederlandse Dialecten (GCND)
In: Belgien / Commission royale de toponymie et de dialectologie. Bulletin de la Commission Royale de Toponymie & Dialectologie. - Bruxelles 92 (2020), 83-115
BLLDB
35
Linguistic Analysis and Automatic Information Extraction of Semantic Relations in Arabic ; Analyse linguistique et extraction automatique de relations sémantiques des textes en arabe
MORSI, Youcef Ihab. - : HAL CCSD, 2020
In: https://hal.archives-ouvertes.fr/tel-03572307 ; Linguistique. Université Bourgogne Franche-Comté, 2020. Français (2020)
BASE
36
Universal Dependencies 2.7
Zeman, Daniel; Nivre, Joakim; Abrams, Mitchell. - : Universal Dependencies Consortium, 2020
BASE
37
Universal Dependencies 2.6
Zeman, Daniel; Nivre, Joakim; Abrams, Mitchell. - : Universal Dependencies Consortium, 2020
BASE
38
IWPT 2020 Shared Task Data and System Outputs
Zeman, Daniel; Bouma, Gosse; Seddah, Djamé. - : Universal Dependencies Consortium, 2020
BASE
39
Annotated Corpus of Pre-Standardized Balkan Slavic Literature
Šimko, Ivan. - : Slavic Seminary, University of Zurich, 2020
BASE
40
Late Latin Charter Treebank 1 (LLCT1), version 1.2 ...
Korkiakangas, Timo. - : Zenodo, 2020
BASE
