5
Crowdsourcing and Aggregating Nested Markable Annotations ...
Abstract:
One of the key steps in language resource creation is the identification of the text segments to be annotated, or markables, which depending on the task may vary from nominal chunks for named entity resolution to (potentially nested) noun phrases in coreference resolution (or mentions) to larger text segments in text segmentation. Markable identification is typically carried out semi-automatically, by running a markable identifier and correcting its output by hand—which is increasingly done via annotators recruited through crowdsourcing and aggregating their responses. In this paper, we present a method for identifying markables for coreference annotation that combines high-performance automatic markable detectors with checking with a Game-With-A-Purpose (GWAP) and aggregation using a Bayesian annotation model. The method was evaluated both on news data and data from a variety of other genres and results in an improvement on F1 of mention boundaries of over seven percentage points when compared with a ...
Keyword:
020 Bibliotheks- und Informationswissenschaft
URL: https://epub.uni-regensburg.de/id/eprint/43402 https://dx.doi.org/10.5283/epub.43402
BASE
7
A Crowdsourced Corpus of Multiple Judgments and Disagreement on Anaphoric Interpretation
BASE
9
A Crowdsourced Corpus of Multiple Judgments and Disagreement on Anaphoric Interpretation
BASE
10
Exploring Language Style in Chatbots to Increase Perceived Product Value and User Engagement
BASE
11
A Probabilistic Annotation Model for Crowdsourcing Coreference
BASE
14
Markup Infrastructure for the Anaphoric Bank: Supporting Web Collaboration
BASE