1. Quality and Efficiency of Manual Annotation: Data from the Pre-annotation Bias Experiment (part of the PDT-C 2.0 project)
2. A clinical trials corpus annotated with UMLS entities to enhance the access to evidence-based medicine
3. Constituting the Democrat Corpus: Annotation and Evaluation Procedures ; Élaboration du corpus Democrat : procédures d’annotation et d’évaluation
In: ISSN: 0458-726X ; EISSN: 1958-9549 ; Langages ; https://hal.archives-ouvertes.fr/hal-03474329 ; Langages, Armand Colin (Larousse jusqu'en 2003), 2021, Un corpus annoté en chaînes de référence et son exploitation : le projet Democrat, pp.25-46 ; https://www.revues.armand-colin.com/lettres-langues/langages/langages-no-224-42021/elaboration-du-corpus-democrat-procedures-dannotation-devaluation (2021)
Abstract:
Several corpora have already been manually annotated for referring expressions and coreference chains. However, none of them covers French, except for annotations that concern anaphora more than coreference. The Democrat project has produced such a corpus, which also has a diachronic dimension. Its design raised numerous difficulties, not only linguistic ones but also regarding the homogeneity of the annotations, their verification, and the evaluation of their quality. It is this dimension that we explore and discuss here, focusing on annotation conventions and on the evaluation of the annotations, a procedure that involves computing inter-annotator agreement. This article thus discusses the constitution and content of the Democrat corpus, in order to legitimise the uses that will be made of it.
Keyword:
[SHS.LANGUE]Humanities and Social Sciences/Linguistics; annotation evaluation; annotation guide; evaluation metrics; inter-annotator agreement; manual annotation
URL: https://hal.archives-ouvertes.fr/hal-03474329
4. pygamma-agreement: Gamma measure for inter/intra-annotator agreement in Python ...
5. pygamma-agreement: Gamma measure for inter/intra-annotator agreement in Python ...
6. Inter-annotator agreement in spoken language annotation: Applying uα-family coefficients to discourse segmentation
In: Russian Journal of Linguistics, Vol 25, Iss 2, Pp 478-506 (2021)
7. Annotation des proéminences pour la segmentation de corpus oraux : l’expérience du projet SegCor
In: CMLF 2018 - 6e Congrès Mondial de Linguistique Française ; https://halshs.archives-ouvertes.fr/halshs-01839314 ; CMLF 2018 - 6e Congrès Mondial de Linguistique Française, Franck Neveu; Bernard Harmegnies; Linda Hriba; Sophie Prévost, Jul 2018, Mons, Belgique (2018)
8. Annotation of prominences for the segmentation of oral corpora: the Segcor project experiment ; Annotation des proéminences pour la segmentation de corpus oraux : l'expérience du projet SegCor
In: ISSN: 2261-2424 ; SHS Web of Conferences ; https://hal.archives-ouvertes.fr/hal-01959747 ; SHS Web of Conferences, EDP Sciences, 2018, 6e Congrès Mondial de Linguistique Française, 46 (2018)
10. A contribution to Computational Linguistics and Natural Language Processing: From the Semantics of Space and Time to Annotations and Agreement Measures
In: https://hal.archives-ouvertes.fr/tel-01713846 ; Artificial Intelligence [cs.AI]. Université de Caen Normandie, 2017 (2017)
11. A French clinical corpus with comprehensive semantic annotations: development of the Medical Entity and Relation LIMSI annOtated Text corpus (MERLOT)
In: ISSN: 1574-020X ; EISSN: 1574-0218 ; Language Resources and Evaluation ; https://hal.archives-ouvertes.fr/hal-01631743 ; Language Resources and Evaluation, Springer Verlag, 2017, 52 (2), pp.571-601. ⟨10.1007/s10579-017-9382-y⟩ (2017)
12. A Large Rated Lexicon with French Medical Words
In: LREC (Language Resources and Evaluation Conference) 2016 ; https://hal.archives-ouvertes.fr/hal-01426790 ; LREC (Language Resources and Evaluation Conference) 2016, May 2016, Portorož, Slovenia (2016)
14. Inter-annotator agreement for a speech corpus pronounced by French and German language learners
In: Workshop on Speech and Language Technology in Education ; https://hal.archives-ouvertes.fr/hal-01185194 ; Workshop on Speech and Language Technology in Education, ISCA Special Interest Group (SIG) on Speech and Language Technology in Education, Sep 2015, Leipzig, Germany (2015)
15. The GV-LEx corpus of tales in French ; The GV-LEx corpus of tales in French: Text and speech corpora enriched with lexical, discourse, structural, phonemic and prosodic annotations
In: ISSN: 1574-020X ; EISSN: 1574-0218 ; Language Resources and Evaluation ; https://halshs.archives-ouvertes.fr/halshs-01251140 ; Language Resources and Evaluation, Springer Verlag, 2015, 49 (3), pp.521-547. ⟨10.1007/s10579-015-9306-7⟩ (2015)
16. The Unified and Holistic Method Gamma for Inter-annotator Agreement Measure and Alignment
In: ISSN: 0891-2017 ; EISSN: 1530-9312 ; Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-01145352 ; Computational Linguistics, Massachusetts Institute of Technology Press (MIT Press), 2015, 41 (3), pp.437-479 (2015)
17. The System of Register Labels in plWordNet
In: Cognitive Studies | Études cognitives; No 15 (2015); 161-175 ; 2392-2397 (2015)
18. Correcting and Validating Syntactic Dependency in the Spoken French Treebank Rhapsodie
In: Proceedings of the 9th Language Resources and Evaluation Conference (LREC) ; https://halshs.archives-ouvertes.fr/halshs-01011059 ; Proceedings of the 9th Language Resources and Evaluation Conference (LREC), 2014, Iceland. pp.1-6 (2014)
19. Manual Corpus Annotation: Giving Meaning to the Evaluation Metrics
In: Proceedings of the International Conference on Computational Linguistics (COLING 2012) ; International Conference on Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-00769639 ; International Conference on Computational Linguistics, Dec 2012, Mumbaï, India. pp.809-818 (2012)
20. Annotating Cognates and Etymological Origin in Turkic Languages