
Search in the Catalogues and Directories

Hits 1 – 20 of 31

1
1993-2007 United Nations Parallel Text
Franz, Alex; Kumar, Shankar; Brants, Thorsten. - : Linguistic Data Consortium, 2013. : https://www.ldc.upenn.edu, 2013
BASE
2
1993-2007 United Nations Parallel Text ...
Franz, Alex; Kumar, Shankar; Brants, Thorsten. - : Linguistic Data Consortium, 2013
BASE
3
Web 1T 5-gram, 10 European Languages Version 1
Brants, Thorsten; Franz, Alex. - : Linguistic Data Consortium, 2009. : https://www.ldc.upenn.edu, 2009
BASE
4
Web 1T 5-gram, 10 European Languages Version 1 ...
Brants, Thorsten; Franz, Alex. - : Linguistic Data Consortium, 2009
BASE
5
Tagging and parsing with cascaded Markov models : automation of corpus annotation
UB Frankfurt Linguistik
6
Statistisch basierte Sprachmodelle und maschinelle Übersetzung
In: Sprachkorpora. - Berlin [u.a.] : de Gruyter (2007), 235-248
BLLDB
7
Tagging and parsing with cascaded Markov models : automation of corpus annotation
BASE
8
Web 1T 5-gram Version 1
Brants, Thorsten; Franz, A.. - Philadelphia, PA : Linguistic Data Consortium, 2006
MPI für Psycholinguistik
9
Web 1T 5-gram, 10 European Languages Version 1
Brants, Thorsten; Franz, A.. - Philadelphia, PA : Linguistic Data Consortium, 2006
MPI für Psycholinguistik
10
Web 1T 5-gram Version 1
Brants, Thorsten; Franz, Alex. - : Linguistic Data Consortium, 2006. : https://www.ldc.upenn.edu, 2006
BASE
11
Web 1T 5-gram Version 1 ...
Brants, Thorsten; Franz, Alex. - : Linguistic Data Consortium, 2006
Abstract:

Introduction

Web 1T 5-gram Version 1 was contributed by Google Inc. and contains English word n-grams and their observed frequency counts for approximately 1 trillion tokens. The length of the n-grams ranges from unigrams (single words) to five-grams. This data is expected to be useful for statistical language modeling, e.g., for machine translation or speech recognition, as well as for other uses.
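As a rough illustration of what "n-grams and their observed frequency counts" means, the following minimal sketch counts every n-gram from unigrams to five-grams in a short token list. The function name `count_ngrams`, the `max_n` parameter, and the sample sentence are assumptions made for this example; this is not the procedure used to build the corpus, which is not described here.

```python
from collections import Counter


def count_ngrams(tokens, max_n=5):
    """Count all n-grams of length 1..max_n in a list of tokens.

    Hypothetical helper for illustration only; each n-gram is stored
    as a tuple of tokens mapped to its observed frequency.
    """
    counts = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of width n across the token list.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Whitespace splitting stands in for real tokenization here.
tokens = "the cat sat on the mat".split()
counts = count_ngrams(tokens)

for ngram, freq in counts.most_common(3):
    print(" ".join(ngram), freq)
```

A language model built on such counts would typically estimate the probability of a word from the relative frequency of the n-gram ending in it, which is why the counts are useful for machine translation and speech recognition.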

Data

The n-gram counts were generated from text taken from publicly accessible Web pages. The input encoding of documents was automatically detected, and all text was converted to UTF-8. The data was tokenized in a manner similar to the tokenization of the Wall Street Journal portion of the Penn Treebank. Notable exceptions include the following:
  • Hyphenated words are usually separated, and hyphenated numbers usually form one token.
  • Sequences of numbers separated by slashes (e.g. in dates) form one token.
  • ...
URL: https://catalog.ldc.upenn.edu/LDC2006T13
https://dx.doi.org/10.35111/cqpa-a498
BASE
12
Syntactic annotation of a German newspaper corpus
In: Treebanks. - Dordrecht [u.a.] : Kluwer (2003), 73-87
BLLDB
13
Syntactic Annotation of a German Newspaper Corpus
In: Treebanks. Building and Using Parsed Corpora (2003), 73-88
IDS Bibliografie zur deutschen Grammatik
14
Story Link Detection and New Event Detection are Asymmetric
In: DTIC (2003)
BASE
15
The LinGO redwoods treebank: Motivation and preliminary applications
Oepen, Stephan; Brants, Thorsten; Toutanova, Kristina. - : Association for Computing Machinery, 2002
BASE
16
Wide-Coverage Probabilistic Sentence Processing
In: Journal of psycholinguistic research. - New York, NY ; London [u.a.] : Springer 29 (2000) 6, 647
OLC Linguistik
17
Wide-coverage probabilistic sentence processing
In: Journal of psycholinguistic research. - New York, NY ; London [u.a.] : Springer 29 (2000) 6, 647-669
BLLDB
18
Tagging and parsing with cascaded Markov models : automation of corpus annotation
Brants, Thorsten. - Saarbrücken : Univ., Department of Computational Linguistics and Phonetics, 1999
BLLDB
UB Frankfurt Linguistik
19
Tagging and parsing with cascaded Markov models : automation of corpus annotation ...
Brants, Thorsten. - : Universität des Saarlandes, 1999
BASE
20
The Tbilisi symposium on logic, language and computation : 19-22 October 1995, Gudauri, Georgia. Selected papers
Ginzburg, Jonathan (ed.); Amsili, Pascal (contrib.); Le Draoulec, Anne (contrib.). - Stanford, Calif. : Center for the Study of Language and Information, 1998
BLLDB
