21 |
Que fait et peut faire la communauté scientifique ?
|
|
|
|
In: Les technologies pour les langues régionales de France ; https://hal.archives-ouvertes.fr/hal-01280449 ; Les technologies pour les langues régionales de France, Feb 2015, Meudon, France. DGLFLF, pp.139-145, 2016 ; http://webcast.in2p3.fr/videos-tlrf_table_ronde_1 (2016)
|
|
BASE
|
|
22 |
Breaking the unwritten language barrier: the BULB project
|
|
|
|
In: SLTU-2016 5th Workshop on Spoken Language Technologies for Under-resourced languages ; https://halshs.archives-ouvertes.fr/halshs-01428027 ; SLTU-2016 5th Workshop on Spoken Language Technologies for Under-resourced languages, May 2016, Yogyakarta, Indonesia. ⟨10.1016/j.procs.2016.04.023⟩ (2016)
|
|
BASE
|
|
23 |
Innovative technologies for under-resourced language documentation: The BULB Project
|
|
|
|
In: CCURL proceedings ; Workshop CCURL 2016 - Collaboration and Computing for Under-Resourced Languages - LREC ; https://hal.archives-ouvertes.fr/hal-01350124 ; Workshop CCURL 2016 - Collaboration and Computing for Under-Resourced Languages - LREC, May 2016, Portoroz, Slovenia (2016)
|
|
BASE
|
|
24 |
BULB: Breaking the Unwritten Language Barrier
|
|
|
|
In: Procedia Computer Science ; Computational Methods for Endangered Language Documentation and Description ; https://hal.archives-ouvertes.fr/hal-01836496 ; Computational Methods for Endangered Language Documentation and Description, May 2016, Yogyakarta, Indonesia. pp.8-14, ⟨10.1016/j.procs.2016.04.023⟩ (2016)
|
|
BASE
|
|
25 |
Ethical Issues in Corpus Linguistics And Annotation: Pay Per Hit Does Not Affect Effective Hourly Rate For Linguistic Resource Development On Amazon Mechanical Turk
|
|
|
|
BASE
|
|
26 |
Faire du TAL sur des données personnelles : un oxymore ?
|
|
|
|
In: TALN 2015 ; https://hal.archives-ouvertes.fr/hal-01171519 ; TALN 2015, ATALA, Jun 2015, Caen, France ; https://taln2015.greyc.fr (2015)
|
|
BASE
|
|
27 |
Automatic language identity tagging on word and sentence-level in multilingual text sources: a case-study on Luxembourgish
|
|
|
|
In: International Conference on Language Resources and Evaluation ; https://hal.archives-ouvertes.fr/hal-01843401 ; International Conference on Language Resources and Evaluation, May 2014, Reykjavik, Iceland (2014)
|
|
BASE
|
|
28 |
Automatic Language Identity Tagging on Word and Sentence-Level in Multilingual Text Sources: a Case-Study on Luxembourgish
|
|
|
|
In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14) ; Ninth International Conference on Language Resources and Evaluation (LREC'14) ; https://hal.archives-ouvertes.fr/hal-01134776 ; Ninth International Conference on Language Resources and Evaluation (LREC'14), European Language Resources Association (ELRA), May 2014, Reykjavik, Iceland. pp.3300-3304 ; http://lrec2014.lrec-conf.org/en/ (2014)
|
|
BASE
|
|
29 |
Evaluating Corpora Documentation with regards to the Ethics and Big Data Charter
|
|
|
|
In: International Conference on Language Resources and Evaluation (LREC) ; https://hal.inria.fr/hal-00969180 ; International Conference on Language Resources and Evaluation (LREC), May 2014, Reykjavik, Iceland (2014)
|
|
BASE
|
|
30 |
"Where the data are coming from?" Ethics, crowdsourcing and traceability for Big Data in Human Language Technology
|
|
|
|
In: Crowdsourcing and human computation multidisciplinary workshop ; https://hal.archives-ouvertes.fr/hal-01078045 ; Crowdsourcing and human computation multidisciplinary workshop, CNRS, Sep 2014, Paris, France (2014)
|
|
BASE
|
|
31 |
Crowdsourcing for Language Resource Development: Criticisms About Amazon Mechanical Turk Overpowering Use
|
|
|
|
In: Human Language Technology Challenges for Computer Science and Linguistics ; https://hal.inria.fr/hal-01053047 ; Vetulani, Zygmunt and Mariani, Joseph. Human Language Technology Challenges for Computer Science and Linguistics, 8387, Springer International Publishing, pp.303-314, 2014, Lecture Notes in Computer Science, 978-3-319-08957-7. ⟨10.1007/978-3-319-08958-4_25⟩ (2014)
|
|
BASE
|
|
32 |
Crowdsourcing for Speech: Economic, Legal and Ethical analysis
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-01067110 ; [Research Report] LIG lab. 2014 (2014)
|
|
BASE
|
|
33 |
Modélisation acoustico-phonétique de langues peu dotées : Études phonétiques et travaux de reconnaissance automatique en luxembourgois
|
|
|
|
In: Journées d'Etude sur la Parole ; https://hal.archives-ouvertes.fr/hal-01843399 ; Journées d'Etude sur la Parole, Jan 2014, Le Mans, France (2014)
|
|
BASE
|
|
34 |
Speech Alignment and Recognition Experiments for Luxembourgish
|
|
|
|
In: Proceedings of the 4th International Workshop on Spoken Language Technologies for Underresourced Languages ; 4th International Workshop on Spoken Language Technologies for Underresourced Languages ; https://hal.archives-ouvertes.fr/hal-01134824 ; 4th International Workshop on Spoken Language Technologies for Underresourced Languages, May 2014, Saint-Petersbourg, Russia. pp.53-60 ; http://www.mica.edu.vn/sltu2014/ (2014)
|
|
BASE
|
|
35 |
A First LVCSR System for Luxembourgish, a Low-Resourced European Language
|
|
|
|
In: Human Language Technology Challenges for Computer Science and Linguistics ; https://hal.archives-ouvertes.fr/hal-01135103 ; Zygmunt Vetulani; Joseph Mariani. Human Language Technology Challenges for Computer Science and Linguistics, 8387, Springer International Publishing, pp.479-490, 2014, 5th Language and Technology Conference, LTC 2011, Poznań, Poland, November 25--27, 2011, Revised Selected Papers, 978-3-319-08957-7. ⟨10.1007/978-3-319-08958-4_39⟩ (2014)
|
|
BASE
|
|
36 |
What we can learn from ASR errors about low-resourced languages: a case-study of Luxembourgish and Austrian
|
|
|
|
In: Errors by Humans and Machines in Multimedia, Multimodal, Multilingual Data Processing ; https://hal.archives-ouvertes.fr/hal-01843440 ; Errors by Humans and Machines in Multimedia, Multimodal, Multilingual Data Processing, Jan 2013, Ermenonville, France (2013)
|
|
BASE
|
|
37 |
What we can learn from ASR errors about low-resourced languages: a case-study of Luxembourgish and Austrian
|
|
|
|
In: Errors by Humans and Machines in Multimedia, Multimodal, Multilingual Data Processing (ERRARE 2013) ; https://halshs.archives-ouvertes.fr/halshs-01424902 ; Errors by Humans and Machines in Multimedia, Multimodal, Multilingual Data Processing (ERRARE 2013), Nov 2013, Ermenonville, France (2013)
|
|
BASE
|
|
38 |
Systèmes de transcription comme instruments
|
|
|
|
In: Méthodes et outils pour l'analyse phonétique des grands corpus oraux ; https://hal.archives-ouvertes.fr/hal-01135113 ; Nguyen Noël; Adda-Decker Martine. Méthodes et outils pour l'analyse phonétique des grands corpus oraux, Hermes Science Publications, pp.159-202, 2013, Cognition et Traitement de l'Information, 978-2746245303 (2013)
|
|
BASE
|
|
39 |
Une étude quantitative des marqueurs discursifs, disfluences et chevauchements de parole dans des interviews politiques
|
|
|
|
In: ISSN: 2118-870X ; EISSN: 2264-7082 ; Travaux Interdisciplinaires du Laboratoire Parole et Langage d'Aix-en-Provence (TIPA) ; https://hal.archives-ouvertes.fr/hal-01135042 ; Travaux Interdisciplinaires du Laboratoire Parole et Langage d'Aix-en-Provence (TIPA), Laboratoire Parole et Langage, 2013, pp.18. ⟨10.4000/tipa.830⟩ (2013)
|
|
BASE
|
|
40 |
Un turc mécanique pour les ressources linguistiques : critique de la myriadisation du travail parcellisé
|
|
|
|
In: TALN'2011 - Traitement Automatique des Langues Naturelles ; https://hal.inria.fr/inria-00617067 ; TALN'2011 - Traitement Automatique des Langues Naturelles, Jun 2011, Montpellier, France (2011)
|
|
Abstract:
This article is a position paper on Amazon Mechanical Turk-like systems, whose use has grown steadily in natural language processing over the past few years. According to the mainstream opinion expressed in the articles of the domain, these online work platforms make it possible to develop all sorts of quality language resources very quickly and for a very low price, using people who do the work as a hobby. We demonstrate here that the situation is far from that ideal, whether in terms of quality, price, workers' status or ethics. We then review alternatives that already exist or have been proposed. Our goal is twofold: to inform researchers, so that they can make their own choices with full knowledge of the facts, and to propose practical and organizational solutions for improving the development of new language resources while limiting the risk of ethical and legal problems, without sacrificing price or quality.
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; Amazon Mechanical Turk; language resources
|
|
URL: https://hal.inria.fr/inria-00617067/document https://hal.inria.fr/inria-00617067/file/TALN2011-MTurk.pdf https://hal.inria.fr/inria-00617067
|
|
BASE
|