41 |
Crowdsourcing for language resource development: critical analysis of Amazon Mechanical Turk overpowering use
|
|
|
|
In: Language & Technology Conference : Human Language Technologies as a Challenge for Computer Science and Linguistics ; https://hal.archives-ouvertes.fr/hal-01840838 ; Language & Technology Conference : Human Language Technologies as a Challenge for Computer Science and Linguistics, Jan 2011, Poznan, Poland (2011)
|
|
BASE
|
|
|
|
42 |
Pronunciation and Writing Variants in an Under-Resourced Language: The Case of Luxembourgish Mobile N-Deletion
|
|
|
|
In: Human Language Technology. Challenges for Computer Science and Linguistics ; https://hal.archives-ouvertes.fr/hal-01135097 ; Zygmunt Vetulani. Human Language Technology. Challenges for Computer Science and Linguistics, 6562, Springer Berlin Heidelberg, pp.70-81, 2011, 4th Language and Technology Conference, LTC 2009, Poznan, Poland, November 6-8, 2009, Revised Selected Papers, 978-3-642-20094-6. ⟨10.1007/978-3-642-20095-3_7⟩ (2011)
|
|
Abstract:
The national language of the Grand-Duchy of Luxembourg, Luxembourgish, has often been characterized as one of Europe’s under-described and under-resourced languages. Because of the limited written production of Luxembourgish, poorly observed writing standardization (as compared to other languages such as English and French) and a large diversity of spoken varieties, the study of Luxembourgish poses many interesting challenges to automatic speech processing studies as well as to linguistic enquiries. In the present paper, we make use of large corpora to focus on typical writing and derived pronunciation variants in Luxembourgish, elicited by mobile -n deletion (hereafter shortened to MND). Using transcriptions from the House of Parliament debates and 10k words from news reports, we examine the reality of MND variants in written transcripts of speech. The goals of this study are manifold: to quantify the potential for variation due to MND in written Luxembourgish, to check the mandatory status of the MND rule, and to discuss the problems that arise for automatic processing of spoken Luxembourgish.
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [SHS.LANGUE]Humanities and Social Sciences/Linguistics; low e-resourced languages; Luxembourgish; pronunciation variants; writing variants
|
|
URL: https://doi.org/10.1007/978-3-642-20095-3_7 https://hal.archives-ouvertes.fr/hal-01135097
|
|
BASE
|
|
|
|
43 |
LIMSI @ WMT11
|
|
|
|
In: WMT 2011. Sixth Workshop on Statistical Machine Translation : Proceedings of the Workshop ; Sixth Workshop on Statistical Machine Translation (WMT 11) ; https://hal.archives-ouvertes.fr/hal-00644047 ; Sixth Workshop on Statistical Machine Translation (WMT 11), Jul 2011, Edinburgh, United Kingdom. pp.309-315 (2011)
|
|
BASE
|
|
|
|
44 |
Crowdsourcing for Language Resource Development: Critical Analysis of Amazon Mechanical Turk Overpowering Use
|
|
|
|
In: LTC 2011 : Proceedings of the 5th Language and Technology Conference ; 5th Language and Technology Conference ; https://hal.archives-ouvertes.fr/hal-00648187 ; 5th Language and Technology Conference, Nov 2011, Poznan, Poland (2011)
|
|
BASE
|
|
|
|
45 |
Amazon Mechanical Turk: Gold Mine or Coal Mine?
|
|
|
|
In: ISSN: 0891-2017 ; EISSN: 1530-9312 ; Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-00569450 ; Computational Linguistics, Massachusetts Institute of Technology Press (MIT Press), 2011, pp.413-420. ⟨10.1162/COLI_a_00057⟩ (2011)
|
|
BASE
|
|
|
|
46 |
Question answering on web data : the QA evaluation in Quaero
|
|
|
|
In: Proceedings of the Seventh conference on International Language Resources and Evaluation (LREC’10) ; International Conference on Language Resources and Evaluation ; https://hal.archives-ouvertes.fr/hal-02282126 ; International Conference on Language Resources and Evaluation, Jan 2010, Valletta, Malta (2010)
|
|
BASE
|
|
|
|
47 |
Annotation and analysis of overlapping speech in political interviews
|
|
|
|
In: LREC 2008 ; https://hal.archives-ouvertes.fr/hal-01690328 ; LREC 2008, May 2008, Marrakech, Morocco (2008)
|
|
BASE
|
|
|
|
48 |
CallSurf - Automatic transcription, indexing and structuration of call center conversational speech for knowledge extraction and query by content
|
|
|
|
In: Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08) ; Sixth International Conference on Language Resources and Evaluation (LREC'08) ; https://hal.archives-ouvertes.fr/hal-00716016 ; Sixth International Conference on Language Resources and Evaluation (LREC'08), May 2008, Marrakech, Morocco. pp.2623-2628 (2008)
|
|
BASE
|
|
|
|
49 |
Advances in Transcription of Broadcast News and Conversational Telephone Speech Within the Combined EARS BBN/LIMSI System
|
|
|
|
In: ISSN: 1558-7916 ; IEEE Transactions on Audio, Speech and Language Processing ; https://hal.archives-ouvertes.fr/hal-01299058 ; IEEE Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2006 (2006)
|
|
BASE
|
|
|
|
50 |
A quantitative study of disfluencies in French broadcast interviews
|
|
|
|
In: Proceedings of DISS'05 (Disfluency in Spontaneous Speech) ; DISS'05 (Disfluency in Spontaneous Speech) ; https://halshs.archives-ouvertes.fr/halshs-00399001 ; DISS'05 (Disfluency in Spontaneous Speech), Sep 2005, Aix-en-Provence, France. pp.27-32 (2005)
|
|
BASE
|
|
|
|
|
|