41 |
Crowdsourcing for language resource development: critical analysis of Amazon Mechanical Turk overpowering use
In: Language & Technology Conference : Human Language Technologies as a Challenge for Computer Science and Linguistics ; https://hal.archives-ouvertes.fr/hal-01840838 ; Language & Technology Conference : Human Language Technologies as a Challenge for Computer Science and Linguistics, Jan 2011, Poznan, Poland (2011)
42 |
Pronunciation and Writing Variants in an Under-Resourced Language: The Case of Luxembourgish Mobile N-Deletion
In: Human Language Technology. Challenges for Computer Science and Linguistics ; https://hal.archives-ouvertes.fr/hal-01135097 ; Zygmunt Vetulani. Human Language Technology. Challenges for Computer Science and Linguistics, 6562, Springer Berlin Heidelberg, pp.70-81, 2011, 4th Language and Technology Conference, LTC 2009, Poznan, Poland, November 6-8, 2009, Revised Selected Papers, 978-3-642-20094-6. ⟨10.1007/978-3-642-20095-3_7⟩ (2011)
43 |
LIMSI @ WMT11
In: WMT 2011. Sixth Workshop on Statistical Machine Translation : Proceedings of the Workshop ; Sixth Workshop on Statistical Machine Translation (WMT 11) ; https://hal.archives-ouvertes.fr/hal-00644047 ; Sixth Workshop on Statistical Machine Translation (WMT 11), Jul 2011, Edinburgh, United Kingdom. pp.309-315 (2011)
44 |
Crowdsourcing for Language Resource Development: Critical Analysis of Amazon Mechanical Turk Overpowering Use
In: LTC 2011 : Proceedings of the 5th Language and Technology Conference ; 5th Language and Technology Conference ; https://hal.archives-ouvertes.fr/hal-00648187 ; 5th Language and Technology Conference, Nov 2011, Poznan, Poland (2011)
Abstract:
This article is a position paper about crowdsourced microworking systems, especially Amazon Mechanical Turk, whose use in language processing has grown steadily over the past few years. According to the mainstream opinion expressed in the literature of the field, this type of online working platform makes it possible to develop all sorts of quality language resources very quickly and at very low cost, using people who do the work as a hobby or to earn some extra cash. We demonstrate here that the situation is far from ideal, whether in terms of quality, price, workers' status, or ethics, and we recall alternatives that already exist or have been proposed. Our goal is threefold: (1) to inform researchers, so that they can make their own choices with all the relevant considerations in mind; (2) to ask funding agencies and scientific associations for help, and to develop alternatives; and (3) to propose practical and organizational solutions that improve the development of new language resources while limiting the risk of ethical and legal issues, without sacrificing price or quality.
Keyword:
[INFO.INFO-TT]Computer Science [cs]/Document and Text Processing
URL: https://hal.archives-ouvertes.fr/hal-00648187/file/ltc-56-adda_final.pdf https://hal.archives-ouvertes.fr/hal-00648187 https://hal.archives-ouvertes.fr/hal-00648187/document
45 |
Amazon Mechanical Turk: Gold Mine or Coal Mine?
In: ISSN: 0891-2017 ; EISSN: 1530-9312 ; Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-00569450 ; Computational Linguistics, Massachusetts Institute of Technology Press (MIT Press), 2011, pp.413-420. ⟨10.1162/COLI_a_00057⟩ (2011)
46 |
Question answering on web data : the QA evaluation in Quaero
In: Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC’10) ; International Conference on Language Resources and Evaluation ; https://hal.archives-ouvertes.fr/hal-02282126 ; International Conference on Language Resources and Evaluation, Jan 2010, Valletta, Malta (2010)
47 |
Annotation and analysis of overlapping speech in political interviews
In: LREC 2008 ; https://hal.archives-ouvertes.fr/hal-01690328 ; LREC 2008, May 2008, Marrakech, Morocco (2008)
48 |
CallSurf - Automatic transcription, indexing and structuration of call center conversational speech for knowledge extraction and query by content
In: Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08) ; Sixth International Conference on Language Resources and Evaluation (LREC'08) ; https://hal.archives-ouvertes.fr/hal-00716016 ; Sixth International Conference on Language Resources and Evaluation (LREC'08), May 2008, Marrakech, Morocco. pp.2623-2628 (2008)
50 |
Advances in Transcription of Broadcast News and Conversational Telephone Speech Within the Combined EARS BBN/LIMSI System
In: ISSN: 1558-7916 ; IEEE Transactions on Audio, Speech and Language Processing ; https://hal.archives-ouvertes.fr/hal-01299058 ; IEEE Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2006 (2006)
52 |
A quantitative study of disfluencies in French broadcast interviews
In: Proceedings of DISS'05 (Disfluency in Spontaneous Speech) ; DISS'05 (Disfluency in Spontaneous Speech) ; https://halshs.archives-ouvertes.fr/halshs-00399001 ; DISS'05 (Disfluency in Spontaneous Speech), Sep 2005, Aix-en-Provence, France. pp.27-32 (2005)