
Search in the Catalogues and Directories

Hits 1 – 10 of 10

1
Pluricentric languages : automatic identification and linguistic variation ... : Plurizentrische Sprachen : automatische Spracherkennung und linguistische Variation ...
Zampieri, Marcos. - : Universität des Saarlandes, 2016
BASE
2
Digital Humanities, Computational Linguistics, And Natural Language Processing ...
Piotrowski, Michael. - : Zenodo, 2016
BASE
3
Digital Humanities, Computational Linguistics, And Natural Language Processing ...
Piotrowski, Michael. - : Zenodo, 2016
BASE
4
Language model driven analysis : simplifying text on an individual scale ... : Benutzerzentrierte Modelle - Versuch unbekannte Wörter zu finden ...
Strelzow, Alexej. - : TU Wien, 2016
BASE
5
Data Cleaning for XML Electronic Dictionaries via Statistical Anomaly Detection ...
Bloodgood, Michael; Strauss, Benjamin. - : Digital Repository at the University of Maryland, 2016
BASE
6
Pluricentric languages : automatic identification and linguistic variation ; Plurizentrische Sprachen : automatische Spracherkennung und linguistische Variation
BASE
7
DOCREP: Document Representation for Natural Language Processing
Dawborn, Timothy James. - : The University of Sydney, 2016. : Faculty of Engineering and Information Technologies, School of Information Technologies, 2016
BASE
8
Evaluating Parsers with Dependency Constraints
Ng, Dominick. - : The University of Sydney, 2016. : Faculty of Engineering and Information Technologies, School of Information Technologies, 2016
Abstract: Many syntactic parsers now score over 90% on English in-domain evaluation, but the remaining errors have been challenging to address and difficult to quantify. Standard parsing metrics provide a consistent basis for comparison between parsers, but do not illuminate what errors remain to be addressed. This thesis develops a constraint-based evaluation for dependency and Combinatory Categorial Grammar (CCG) parsers to address this deficiency. We examine the constrained and cascading impact, representing the direct and indirect effects of errors on parsing accuracy. This identifies errors that are the underlying source of problems in parses, compared to those which are a consequence of those problems. Kummerfeld et al. (2012) propose a static post-parsing analysis to categorise groups of errors into abstract classes, but this cannot account for cascading changes resulting from repairing errors, or limitations which may prevent the parser from applying a repair. In contrast, our technique is based on enforcing the presence of certain dependencies during parsing, whilst allowing the parser to choose the remainder of the analysis according to its grammar and model. We draw constraints for this process from gold-standard annotated corpora, grouping them into abstract error classes such as NP attachment, PP attachment, and clause attachment. By applying constraints from each error class in turn, we can examine how parsers respond when forced to correctly analyse each class. We show how to apply dependency constraints in three parsers: the graph-based MSTParser (McDonald and Pereira, 2006) and the transition-based ZPar (Zhang and Clark, 2011b) dependency parsers, and the C&C CCG parser (Clark and Curran, 2007b). Each is widely used and influential in the field, and each generates some form of predicate-argument dependencies.
We compare the parsers, identifying common sources of error, and differences in the distribution of errors between constrained and cascaded impact. Our work allows us to contrast the implementations of each parser, and how they respond to constraint application. Using our analysis, we experiment with new features for dependency parsing, which encode the frequency of proposed arcs in large-scale corpora derived from scanned books. These features are inspired by and extend the work of Bansal and Klein (2011). We target these features at the most notable errors, and show how they address some, but not all, of the difficult attachments across newswire and web text. CCG parsing is particularly challenging, as different derivations do not always generate different dependencies. We develop dependency hashing to address semantically redundant parses in n-best CCG parsing, and demonstrate its necessity and effectiveness. Dependency hashing substantially improves the diversity of n-best CCG parses, and improves a CCG reranker when used for creating training and test data. We show the intricacies of applying constraints to C&C, and describe instances where applying constraints causes the parser to produce a worse analysis. These results illustrate how algorithms which are relatively straightforward for constituency and dependency parsers are non-trivial to implement in CCG. This work has explored dependencies as constraints in dependency and CCG parsing. We have shown how dependency hashing can efficiently eliminate semantically redundant CCG n-best parses, and presented a new evaluation framework based on enforcing the presence of dependencies in the output of the parser. By otherwise allowing the parser to proceed as it would have, we avoid the assumptions inherent in other work. We hope this work will provide insights into the remaining errors in parsing, and target efforts to address those errors, creating better syntactic analysis for downstream applications.
Keyword: CCG parsing; computational linguistics; dependency parsing; natural language processing; parsing
URL: http://hdl.handle.net/2123/14550
BASE
9
Data Cleaning for XML Electronic Dictionaries via Statistical Anomaly Detection
BASE
10
Compiling Specialised Comparable Corpora. Should we always trust (Semi-)automatic Compilation Tools?
In: Linguamática, Vol 8, Iss 1 (2016)
BASE
