1 | Text+: Language- and text-based Research Data Infrastructure ...
Source: BASE
5 | From CLEF to TrebleCLEF: the Evolution of the Cross-Language Evaluation Forum
In: http://www.mt-archive.info/NTCIR-2008-Ferro.pdf (2008)
|
|
6 | CLEF: Ongoing Activities and Plans for the Future
In: http://www.mt-archive.info/NTCIR-2007-Agosti.pdf (2007)
|
|
7 | NTCIR CLIR Experiments at the University of Maryland
In: DTIC (2000)
|
|
8 | CLEF: Ongoing Activities and Plans for the Future
In: http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings6/NTCIR/85.pdf
|
|
9 | Cross-Language Person-Entity Linking from Twenty Languages
In: http://www.ece.umd.edu/~oard/pdf/jasist14lawrie.pdf
|
|
10 | Testing the Reasoning for Question Answering Validation
In: http://nlp.uned.es/pergamus2/jlogcomp_draft.pdf
Abstract:
Question Answering (QA) is a task that deserves more collaboration between the Natural Language Processing (NLP) and Knowledge Representation (KR) communities, not only to introduce reasoning when looking for answers or making use of answer type taxonomies and encyclopedic knowledge, but also, as discussed here, for Answer Validation (AV), that is to say, to decide whether the responses of a QA system are correct or not. This was one of the motivations for the first Answer Validation Exercise at CLEF 2006 (AVE 2006). The starting point for AVE 2006 was the reformulation of Answer Validation as a Recognizing Textual Entailment (RTE) problem, under the assumption that a hypothesis can be automatically generated by instantiating a hypothesis pattern with a QA system answer. The test collections that we developed in seven different languages at AVE 2006 are specifically oriented to the development and evaluation of Answer Validation systems. We show in this article the methodology followed for developing these collections, taking advantage of the human assessments already made in the evaluation of QA systems. We also propose an evaluation framework for AV linked to a QA evaluation track. We quantify and discuss the source of errors introduced by the reformulation of the Answer Validation problem in terms of Textual Entailment (around 2%, in the range of inter-annotator disagreement). We also show the evaluation results of the first Answer Validation Exercise at CLEF 2006, in which 11 groups participated with 38 runs in 7 different languages. The most extensively used techniques were Machine Learning and overlapping measures, but systems with broader knowledge resources and richer representation formalisms obtained the best results.

Keyword:
Answer Validation; Evaluation; Question Answering; Test Collections; Textual Entailment

URL: http://nlp.uned.es/pergamus2/jlogcomp_draft.pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.96.1256
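The hypothesis-generation step described in the abstract can be sketched as follows. This is a minimal illustration only: the pattern string, placeholder token, and function name are hypothetical, not taken from the AVE 2006 materials.

```python
# Sketch of reformulating Answer Validation as Textual Entailment:
# a hypothesis is built by instantiating a question-derived pattern
# with the answer returned by a QA system. The RTE decision then asks
# whether the supporting text entails the generated hypothesis.

def build_hypothesis(pattern: str, answer: str) -> str:
    """Instantiate a hypothesis pattern with a QA system's answer.

    The "<ANSWER>" placeholder convention is an assumption for this sketch.
    """
    return pattern.replace("<ANSWER>", answer)

# Question: "Who wrote Don Quixote?"  (hypothetical example)
pattern = "<ANSWER> wrote Don Quixote."
hypothesis = build_hypothesis(pattern, "Miguel de Cervantes")
print(hypothesis)  # Miguel de Cervantes wrote Don Quixote.
```

An AV system would then accept the answer only if the candidate's supporting snippet entails the generated hypothesis, which is exactly the RTE formulation the abstract describes.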