Hits 1 – 14 of 14

1	On Efficiently Acquiring Annotations for Multilingual Models ...
	Moniz, Joel Ruben Antony; Patra, Barun; Gormley, Matthew R.. - : arXiv, 2022

2	Comparative Error Analysis in Neural and Finite-state Models for Unsupervised Character-level Transduction ...
	Ryskina, Maria; Hovy, Eduard; Berg-Kirkpatrick, Taylor. - : arXiv, 2021

3	Comparative Error Analysis in Neural and Finite-state Models for Unsupervised Character-level Transduction ...
	The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing 2021; Berg-Kirkpatrick, Taylor; Gormley, Matthew. - : Underline Science Inc., 2021

4	Phonetic and Visual Priors for Decipherment of Informal Romanization ...
	Ryskina, Maria; Gormley, Matthew R.; Berg-Kirkpatrick, Taylor. - : arXiv, 2020

5	Bilingual Lexicon Induction with Semi-supervision in Non-Isometric Embedding Spaces ...
	Patra, Barun; Moniz, Joel Ruben Antony; Garg, Sarthak. - : arXiv, 2019

6	Concretely Annotated English Gigaword
	Ferraro, Francis; Thomas, Max; Gormley, Matthew R.. - : Linguistic Data Consortium, 2018. : https://www.ldc.upenn.edu, 2018

7	Concretely Annotated New York Times
	Ferraro, Francis; Thomas, Max; Wolfe, Travis. - : Linguistic Data Consortium, 2018. : https://www.ldc.upenn.edu, 2018

8	Neural Factor Graph Models for Cross-lingual Morphological Tagging ...
	Malaviya, Chaitanya; Gormley, Matthew R.; Neubig, Graham. - : arXiv, 2018

9	Concretely Annotated New York Times ...
	Ferraro, Francis; Thomas, Max; Wolfe, Travis. - : Linguistic Data Consortium, 2018

10	Concretely Annotated English Gigaword ...
	Ferraro, Francis; Thomas, Max; Gormley, Matthew R.. - : Linguistic Data Consortium, 2018

11	Embedding Lexical Features via Low-Rank Tensors ...
	Yu, Mo; Dredze, Mark; Arora, Raman. - : arXiv, 2016

12	Improved Relation Extraction with Feature-Rich Compositional Embedding Models ...
	Gormley, Matthew R.; Yu, Mo; Dredze, Mark. - : arXiv, 2015

13	Annotated English Gigaword
	Napoles, Courtney; Gormley, Matthew R.; Van Durme, Benjamin. - : Linguistic Data Consortium, 2012. : https://www.ldc.upenn.edu, 2012
	Abstract:
	Introduction
	Annotated English Gigaword was developed by Johns Hopkins University's Human Language Technology Center of Excellence. It adds automatically generated syntactic and discourse structure annotation to English Gigaword Fifth Edition (LDC2011T07) and also contains an API and tools for reading the dataset's XML files. The goal of the annotation is to provide a standardized corpus for knowledge extraction and distributional semantics, enabling broader involvement by researchers in large-scale knowledge-acquisition efforts.

	Data
	Annotated English Gigaword contains the nearly ten million documents (over four billion words) of the original English Gigaword Fifth Edition from seven news sources:
	* Agence France-Presse, English Service (afp_eng)
	* Associated Press Worldstream, English Service (apw_eng)
	* Central News Agency of Taiwan, English Service (cna_eng)
	* Los Angeles Times/Washington Post Newswire Service (ltw_eng)
	* Washington Post/Bloomberg Newswire Service (wpb_eng)
	* New York Times Newswire Service (nyt_eng)
	* Xinhua News Agency, English Service (xin_eng)

	The following layers of annotation were added:
	* Tokenized and segmented sentences
	* Treebank-style constituent parse trees
	* Syntactic dependency trees
	* Named entities
	* In-document coreference chains

	The annotation was performed in a three-step process: (1) the data was preprocessed and sentences were selected for annotation (sentences with more than 100 tokens were excluded); (2) syntactic parses were derived; and (3) the parsed output was post-processed to derive syntactic dependencies, named entities and coreference chains. Over 183 million sentences were parsed.

	The data is stored in a form similar to the Gigaword SGML format, with XML annotations containing the additional markup. The included API provides object representations for the contents of the XML files.

	Samples
	Please see the link for a sample.

	Additional Licensing Information
	Any 2011 member organization that licensed English Gigaword Fifth Edition (LDC2011T07) may request a no-cost copy of Annotated English Gigaword. Any non-member organization that licensed English Gigaword Fifth Edition may request a copy of Annotated English Gigaword for a $150 fee. Please contact ldc@ldc.upenn.edu for licensing or with any additional questions.

	Updates
	None at this time.
	URL: https://catalog.ldc.upenn.edu/LDC2012T21

14	Annotated English Gigaword ...
	Napoles, Courtney; Gormley, Matthew R.; Van Durme, Benjamin. - : Linguistic Data Consortium, 2012

© 2013 - 2024 Lin|gu|is|tik