Hits 21 – 40 of 565

21	Coreference in Universal Dependencies 0.1 (CorefUD 0.1)
	Nedoluzhko, Anna; Novák, Michal; Popel, Martin; Žabokrtský, Zdeněk; Zeman, Daniel. - : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2021
	Abstract: CorefUD is a collection of previously existing datasets annotated with coreference, which we converted into a common annotation scheme. In total, CorefUD in its current version 0.1 consists of 17 datasets for 11 languages. The datasets are enriched with automatic morphological and syntactic annotations that are fully compliant with the standards of the Universal Dependencies project. All the datasets are stored in the CoNLL-U format, with coreference- and bridging-specific information captured by attribute-value pairs located in the MISC column.
	The collection is divided into a public edition and a non-public (ÚFAL-internal) edition. The publicly available edition is distributed via LINDAT-CLARIAH-CZ and contains 13 datasets for 10 languages (1 dataset for Catalan, 2 for Czech, 2 for English, 1 for French, 2 for German, 1 for Hungarian, 1 for Lithuanian, 1 for Polish, 1 for Russian, and 1 for Spanish), excluding the test data. The non-public edition is available internally to ÚFAL members and contains additional 4 datasets for 2 languages (1 dataset for Dutch, and 3 for English), which we are not allowed to distribute due to their original license limitations. It also contains the test data portions for all datasets.
	When using any of the harmonized datasets, please get acquainted with its license (placed in the same directory as the data) and cite the original data resource too.
	References to original resources whose harmonized versions are contained in the public edition of CorefUD 0.1:
	- Catalan-AnCora: Recasens, M. and Martí, M. A. (2010). AnCora-CO: Coreferentially Annotated Corpora for Spanish and Catalan. Language Resources and Evaluation, 44(4):315–345.
	- Czech-PCEDT: Nedoluzhko, A., Novák, M., Cinková, S., Mikulová, M., and Mírovský, J. (2016). Coreference in Prague Czech-English Dependency Treebank. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 169–176, Portorož, Slovenia. European Language Resources Association.
	- Czech-PDT: Hajič, J., Bejček, E., Hlaváčová, J., Mikulová, M., Straka, M., Štěpánek, J., and Štěpánková, B. (2020). Prague Dependency Treebank - Consolidated 1.0. In Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC 2020), pages 5208–5218, Marseille, France. European Language Resources Association.
	- English-GUM: Zeldes, A. (2017). The GUM Corpus: Creating Multilayer Resources in the Classroom. Language Resources and Evaluation, 51(3):581–612.
	- English-ParCorFull: Lapshinova-Koltunski, E., Hardmeier, C., and Krielke, P. (2018). ParCorFull: a Parallel Corpus Annotated with Full Coreference. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association.
	- French-Democrat: Landragin, F. (2016). Description, modélisation et détection automatique des chaînes de référence (DEMOCRAT). Bulletin de l’Association Française pour l’Intelligence Artificielle, (92):11–15.
	- German-ParCorFull: Lapshinova-Koltunski, E., Hardmeier, C., and Krielke, P. (2018). ParCorFull: a Parallel Corpus Annotated with Full Coreference. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association.
	- German-PotsdamCC: Bourgonje, P. and Stede, M. (2020). The Potsdam Commentary Corpus 2.2: Extending annotations for shallow discourse parsing. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 1061–1066, Marseille, France. European Language Resources Association.
	- Hungarian-SzegedKoref: Vincze, V., Hegedűs, K., Sliz-Nagy, A., and Farkas, R. (2018). SzegedKoref: A Hungarian Coreference Corpus. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan. European Language Resources Association.
	- Lithuanian-LCC: Žitkus, V. and Butkienė, R. (2018). Coreference Annotation Scheme and Corpus for Lithuanian Language. In Fifth International Conference on Social Networks Analysis, Management and Security, SNAMS 2018, Valencia, Spain, October 15-18, 2018, pages 243–250. IEEE.
	- Polish-PCC: Ogrodniczuk, M., Glowińska, K., Kopeć, M., Savary, A., and Zawisławska, M. (2013). Polish coreference corpus. In Human Language Technology. Challenges for Computer Science and Linguistics - 6th Language and Technology Conference, LTC 2013, Poznań, Poland, December 7-9, 2013. Revised Selected Papers, volume 9561 of Lecture Notes in Computer Science, pages 215–226. Springer.
	- Russian-RuCor: Toldova, S., Roytberg, A., Ladygina, A. A., Vasilyeva, M. D., Azerkovich, I. L., Kurzukov, M., Sim, G., Gorshkov, D. V., Ivanova, A., Nedoluzhko, A., and Grishina, Y. (2014). Evaluating Anaphora and Coreference Resolution for Russian. In Komp’juternaja lingvistika i intellektual’nye tehnologii. Po materialam ezhegodnoj Mezhdunarodnoj konferencii Dialog, pages 681–695.
	- Spanish-AnCora: Recasens, M. and Martí, M. A. (2010). AnCora-CO: Coreferentially Annotated Corpora for Spanish and Catalan. Language Resources and Evaluation, 44(4):315–345.
	References to original resources whose harmonized versions are contained in the ÚFAL-internal edition of CorefUD 0.1:
	- Dutch-COREA: Hendrickx, I., Bouma, G., Coppens, F., Daelemans, W., Hoste, V., Kloosterman, G., Mineur, A.-M., Van Der Vloet, J., and Verschelde, J.-L. (2008). A coreference corpus and resolution system for Dutch. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08), Marrakech, Morocco. European Language Resources Association.
	- English-ARRAU: Uryupina, O., Artstein, R., Bristot, A., Cavicchio, F., Delogu, F., Rodriguez, K. J., and Poesio, M. (2020). Annotating a broad range of anaphoric phenomena, in a variety of genres: the ARRAU Corpus. Natural Language Engineering, 26(1):95–128.
	- English-OntoNotes: Weischedel, R., Hovy, E., Marcus, M., Palmer, M., Belvin, R., Pradhan, S., Ramshaw, L., and Xue, N. (2011). Ontonotes: A large training corpus for enhanced processing. In Handbook of Natural Language Processing and Machine Translation: DARPA Global Autonomous Language Exploitation, pages 54–63, New York. Springer-Verlag.
	- English-PCEDT: Nedoluzhko, A., Novák, M., Cinková, S., Mikulová, M., and Mírovský, J. (2016). Coreference in Prague Czech-English Dependency Treebank. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), pages 169–176, Portorož, Slovenia. European Language Resources Association.
	Keyword: bridging relations; coreference; dependency; harmonized annotation; treebank
	URL: http://hdl.handle.net/11234/1-3510
	BASE
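	Note on the MISC column mentioned in the abstract: the sketch below shows one way such attribute-value pairs might be read from a CoNLL-U token line. It relies only on the general CoNLL-U layout (10 tab-separated columns, MISC last, pairs separated by "|", "_" for an empty value). The sample line and the attribute names `ClusterId` and `MentionSpan` are hypothetical illustrations, not taken from the dataset; consult the CorefUD documentation for the actual attribute inventory.

```python
from typing import Dict


def parse_misc(misc: str) -> Dict[str, str]:
    """Split a CoNLL-U MISC value into a dict of attribute-value pairs."""
    if not misc or misc == "_":  # "_" marks an empty column in CoNLL-U
        return {}
    pairs: Dict[str, str] = {}
    for item in misc.split("|"):
        key, sep, value = item.partition("=")
        # Items without "=" are kept as keys with an empty value.
        pairs[key] = value if sep else ""
    return pairs


def misc_of_token_line(line: str) -> Dict[str, str]:
    """Return the parsed MISC column (10th field) of one token line."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 10:
        raise ValueError("expected 10 tab-separated CoNLL-U columns")
    return parse_misc(columns[9])


# Hypothetical token line; the attribute names are assumptions for illustration.
sample = "1\tShe\tshe\tPRON\tPRP\t_\t2\tnsubj\t_\tClusterId=c1|MentionSpan=1"
attrs = misc_of_token_line(sample)
print(attrs)
```

	Comment lines (starting with "#") and blank sentence separators would need to be skipped before calling `misc_of_token_line` on a real file.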

22	Annotated Corpus of Pre-Standardized Balkan Slavic Literature 1.1
	Šimko, Ivan. - : Slavic Seminary, University of Zurich, 2021
	BASE

23	Training corpus ssj500k 2.3
	Krek, Simon; Dobrovoljc, Kaja; Erjavec, Tomaž. - : Centre for Language Resources and Technologies, University of Ljubljana, 2021
	BASE

24	Old Catalan Morphosyntax: Developing an Annotated Corpus
	Meelen, Marieke; Pujol i Campeny, Afra
	In: Journal of Open Humanities Data; Vol 7 (2021); 30 ; 2059-481X (2021)
	BASE

25	IT-TB_PML_analytical-tectogrammatical
	Passarotti, Marco; Testori, Marinella; González Saavedra, Berta. - : CIRCSE Research Centre, Università Cattolica del Sacro Cuore, 2021
	BASE

26	More Data and New Tools. Advances in Parsing the Index Thomisticus Treebank ...
	Gamba, Federica; Passarotti, Marco; Ruffolo, Paolo. - : Zenodo, 2021
	BASE

27	More Data and New Tools. Advances in Parsing the Index Thomisticus Treebank ...
	Gamba, Federica; Passarotti, Marco; Ruffolo, Paolo. - : Zenodo, 2021
	BASE

28	Discourse Relations and Connectives in Higher Text Structure
	Polakova, Lucie; Mírovský, Jiří; Zikánová, Šárka...
	In: Dialogue & Discourse; Vol 12 No 2 (2021); 1-37 ; 2152-9620 (2021)
	BASE

29	Overview of AMALGUM – Large Silver Quality Annotations across English Genres
	Gessler, Luke D; Peng, Siyao; Liu, Yang...
	In: Proceedings of the Society for Computation in Linguistics (2021)
	BASE

30	ODIL Syntax : a Free Spontaneous Spoken French Treebank Annotated with Constituent Trees
	Wang, Ilaine; Pelletier, Aurore; Antoine, Jean-Yves...
	In: Language Resources and Evaluation Conference, LREC ; https://hal.archives-ouvertes.fr/hal-02523141 ; Language Resources and Evaluation Conference, LREC, May 2020, Marseille, France (2020)
	BASE

31	Building a Universal Dependencies Treebank for Occitan
	Miletic, Aleksandra; Bras, Myriam; Vergez-Couret, Marianne...
	In: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020) ; 12th Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-02892715 ; 12th Language Resources and Evaluation Conference, May 2020, Marseille, France. pp.2932-2939 (2020)
	BASE

32	Les constructions à verbe "εἶναι" 'être' et participe présent: 'status quaestionis' et nouvelles propositions
	Logozzo, Felicia; Tronci, Liana
	In: Société de linguistique de Paris. Bulletin de la Société de Linguistique de Paris. - Paris ; Louvain : Peeters 115 (2020) 1, 191-239
	BLLDB

33	Christian Fandrych (Hrsg.): Gesprochene Wissenschaftssprache. Tübingen: Stauffenburg Verlag, 2017
	Jaiser, Gerhard
	In: Informationen Deutsch als Fremdsprache. - Berlin : De Gruyter 47 (2020) 2-3, 205-208
	BLLDB

34	Het transcriptieprotocol van het Gesproken Corpus van de Nederlandse Dialecten (GCND)
	Ghyselen, Anne-Sophie; Van Keymeulen, Jacques; Farasyn, Melissa...
	In: Belgien / Commission royale de toponymie et de dialectologie. Bulletin de la Commission Royale de Toponymie & Dialectologie. - Bruxelles 92 (2020), 83-115
	BLLDB

35	Linguistic Analysis and Automatic Information Extraction of Semantic Relations in Arabic ; Analyse linguistique et extraction automatique de relations sémantiques des textes en arabe
	MORSI, Youcef Ihab. - : HAL CCSD, 2020
	In: https://hal.archives-ouvertes.fr/tel-03572307 ; Linguistique. Université Bourgogne Franche-Comté, 2020. Français (2020)
	BASE

36	Universal Dependencies 2.7
	Zeman, Daniel; Nivre, Joakim; Abrams, Mitchell. - : Universal Dependencies Consortium, 2020
	BASE

37	Universal Dependencies 2.6
	Zeman, Daniel; Nivre, Joakim; Abrams, Mitchell. - : Universal Dependencies Consortium, 2020
	BASE

38	IWPT 2020 Shared Task Data and System Outputs
	Zeman, Daniel; Bouma, Gosse; Seddah, Djamé. - : Universal Dependencies Consortium, 2020
	BASE

39	Annotated Corpus of Pre-Standardized Balkan Slavic Literature
	Šimko, Ivan. - : Slavic Seminary, University of Zurich, 2020
	BASE

40	Late Latin Charter Treebank 1 (LLCT1), version 1.2 ...
	Korkiakangas, Timo. - : Zenodo, 2020
	BASE
