
Search in the Catalogues and Directories







Hits 1 – 20 of 39

1	OntoNotes Release 5.0
	Weischedel, Ralph; Palmer, Martha; Marcus, Mitchell. - : Linguistic Data Consortium, 2013. : https://www.ldc.upenn.edu, 2013
	BASE

2	OntoNotes Release 5.0 ...
	Weischedel, Ralph; Palmer, Martha; Marcus, Mitchell. - : Linguistic Data Consortium, 2013
	BASE

3	OntoNotes Release 4.0
	Weischedel, Ralph; Palmer, Martha; Marcus, Mitchell. - : Linguistic Data Consortium, 2011. : https://www.ldc.upenn.edu, 2011
	BASE

4	OntoNotes Release 4.0 ...
	Weischedel, Ralph; Palmer, Martha; Marcus, Mitchell. - : Linguistic Data Consortium, 2011
	BASE

5	OntoNotes Release 3.0
	Weischedel, Ralph; Pradhan, Sameer; Ramshaw, Lance. - : Linguistic Data Consortium, 2009. : https://www.ldc.upenn.edu, 2009
	BASE

6	OntoNotes Release 3.0 ...
	Weischedel, Ralph; Pradhan, Sameer; Ramshaw, Lance. - : Linguistic Data Consortium, 2009
	BASE

7	OntoNotes Release 2.0
	Weischedel, Ralph; Pradhan, Sameer; Ramshaw, Lance; Palmer, Martha; Xue, Nianwen; Marcus, Mitchell; Taylor, Ann; Greenberg, Craig; Hovy, Eduard; Belvin, Robert; Houston, Ann. - : Linguistic Data Consortium, 2008. : https://www.ldc.upenn.edu, 2008
	Abstract:
	Introduction: The OntoNotes project is a collaborative effort between BBN Technologies, the University of Colorado, the University of Pennsylvania, and the University of Southern California's Information Sciences Institute. The goal of the project is to annotate a large corpus comprising various genres of text (news, conversational telephone speech, weblogs, Usenet, broadcast, talk shows) in three languages (English, Chinese, and Arabic) with structural information (syntax and predicate-argument structure) and shallow semantics (word sense linked to an ontology, and coreference). OntoNotes Release 2.0 is a continuation of the OntoNotes project and is supported by the Defense Advanced Research Projects Agency, GALE Program Contract No. HR0011-06-C-0022.
	OntoNotes Release 1.0 (LDC2007T21) contains 400k words of Chinese newswire data (from Xinhua News Agency and Sinorama Magazine) and 300k words of English newswire data (from the Wall Street Journal). OntoNotes Release 2.0 adds the following to the corpus: 274k words of Chinese broadcast news data (from China Broadcasting System, China Central TV, China National Radio, China Television System and Voice of America); and 200k words of English broadcast news data (from ABC, CNN, NBC, Public Radio International and Voice of America).
	Natural language applications like machine translation, question answering, and summarization are currently forced to depend on impoverished text models like bags of words or n-grams, while the decisions they make ought to be based on the meanings of those words in context. That lack of semantics causes problems throughout the applications. Misinterpreting the meaning of an ambiguous word results in failing to extract data, incorrect alignments for translation, and ambiguous language models. Incorrect coreference resolution results in missed information (because a connection is not made) or incorrectly conflated information (due to false connections).
	OntoNotes builds on two time-tested resources, following the Penn Treebank for syntax and the Penn PropBank for predicate-argument structure. Its semantic representation will include word sense disambiguation for nouns and verbs, with each word sense connected to an ontology, and coreference. The current goals call for annotation of over a million words each of English and Chinese, and half a million words of Arabic, over five years. The authors wish to make this resource available to the natural language research community so that decoders for these phenomena can be trained to generate the same structure in new documents.
	Lessons learned over the years have shown that the quality of annotation is crucial if it is going to be used for training machine learning algorithms. Taking this cue, each layer of annotation in OntoNotes will have at least 90% inter-annotator agreement. Pilot studies have shown that predicate structure, word sense, ontology linking, and coreference can all be annotated rapidly and with better than 90% consistency.
	Samples: For an example of the data in this corpus, see the Chinese and English samples.
	Sponsorship: This work is supported in part by the Defense Advanced Research Projects Agency, GALE Program Grant No. HR0011-06-1-0003. The content of this publication does not necessarily reflect the position or policy of the Government, and no official endorsement should be inferred. The World is a co-production of Public Radio International and the British Broadcasting Corporation and is produced at WGBH Boston.
	URL: https://catalog.ldc.upenn.edu/LDC2008T04
	BASE

8	OntoNotes Release 2.0 ...
	Weischedel, Ralph; Pradhan, Sameer; Ramshaw, Lance. - : Linguistic Data Consortium, 2008
	BASE

9	OntoNotes Release 1.0
	Weischedel, Ralph; Pradhan, Sameer; Ramshaw, Lance. - : Linguistic Data Consortium, 2007. : https://www.ldc.upenn.edu, 2007
	BASE

10	OntoNotes Release 1.0 ...
	Weischedel, Ralph; Pradhan, Sameer; Ramshaw, Lance. - : Linguistic Data Consortium, 2007
	BASE

11	Corpus linguistics : readings in a widening discipline
	Kilgarriff, Adam (contrib.); Fries, Charles Carpenter (contrib.); Francis, Gill (contrib.). - London [et al.] : Continuum, 2004
	BLLDB
	UB Frankfurt Linguistik

12	The Penn Treebank : an overview
	Taylor, Ann; Marcus, Mitchell; Santorini, Beatrice
	In: Treebanks. - Dordrecht [et al.] : Kluwer (2003), 5-22
	BLLDB

13	Natural language processing using very large corpora
	Wu, Dekai (contrib.); Church, Kenneth W. (ed.); Radev, Dragomir R. (contrib.). - Dordrecht [et al.] : Kluwer, 1999
	BLLDB
	UB Frankfurt Linguistik

14	Treebank-3
	Marcus, Mitchell P.; Santorini, Beatrice; Marcinkiewicz, Mary Ann. - : Linguistic Data Consortium, 1999. : https://www.ldc.upenn.edu, 1999
	BASE

15	Treebank-3 ...
	Marcus, Mitchell P.; Santorini, Beatrice; Marcinkiewicz, Mary Ann. - : Linguistic Data Consortium, 1999
	BASE

16	Automatic Construction of Chinese-English Translation Lexicons
	Melamed, I. Dan; Marcus, Mitchell
	In: Departmental Papers (CIS) (1998)
	BASE

17	Exploring the nature of transformation-based learning
	Ramshaw, Lance A.; Marcus, Mitchell P.
	In: The balancing act (London, 1996), p. 135-156
	MPI für Psycholinguistik

18	A theory of syntactic recognition for natural language
	Marcus, Mitchell P.
	In: Cognitive science ; 3. - Aldershot : Elgar (1995), 133-167
	BLLDB

19	Treebank-2
	Marcus, Mitchell P.; Santorini, Beatrice; Marcinkiewicz, Mary Ann. - : Linguistic Data Consortium, 1995. : https://www.ldc.upenn.edu, 1995
	BASE

20	Treebank-2 ...
	Marcus, Mitchell P.; Santorini, Beatrice; Marcinkiewicz, Mary Ann. - : Linguistic Data Consortium, 1995
	BASE
