
Search in the Catalogues and Directories

Hits 1 – 2 of 2

1
BLLIP 1987-89 WSJ Corpus Release 1
Charniak, Eugene; Blaheta, Don; Ge, Niyu; Hall, Keith; Hale, John; Johnson, Mark. Linguistic Data Consortium, 2000. https://www.ldc.upenn.edu
Abstract: *Introduction* Brown Laboratory for Linguistic Information Processing (BLLIP) 1987-89 WSJ Corpus Release 1 contains a complete, Treebank-style part-of-speech (POS) tagged and parsed version of the three-year Wall Street Journal (WSJ) collection from ACL/DCI (LDC93T1), approximately 30 million words. The annotation was performed using statistically based methods developed by BLLIP researchers Eugene Charniak, Don Blaheta, Niyu Ge, Keith Hall, John Hale and Mark Johnson. This corpus both overlaps and supplements the million-word Penn Treebank (PTB) collection of parsed and POS-tagged WSJ texts. *Data* The PTB project selected 2,499 stories from a three-year WSJ collection of 98,732 stories for syntactic annotation. These 2,499 stories are distributed in Treebank-2 (LDC95T7) and Treebank-3 (LDC99T42), both of which include the raw text for each story. *Updates* There are no updates at this time.
URL: https://catalog.ldc.upenn.edu/LDC2000T43
BASE
2
BLLIP 1987-89 WSJ Corpus Release 1 ...
Charniak, Eugene; Blaheta, Don; Ge, Niyu. Linguistic Data Consortium, 2000
BASE

© 2013 - 2024 Lin|gu|is|tik