Search in the Catalogues and Directories

Hits 1 – 20 of 88

1	Simplification automatique de textes biomédicaux en français : les données précises de petite taille aident
	Cardon, Rémi; Grabar, Natalia
	In: Actes de la 28e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale ; TALN - Traitement Automatique des Langues Naturelles ; https://hal.archives-ouvertes.fr/hal-03509735 ; TALN - Traitement Automatique des Langues Naturelles, Jul 2021, Lille, France (2021)
	Abstract: International audience; We present experiments on automatic simplification of French biomedical texts at the sentence level. We use two corpora: (1) 4,596 parallel sentence pairs automatically extracted from comparable French health-domain corpora (Cardon & Grabar, 2019); (2) 297,494 parallel sentence pairs from WikiLarge (Zhang & Lapata, 2017), a general-language corpus dedicated to simplification, which we automatically translated from English into French. To perform the simplification, we use OpenNMT-py (Klein et al., 2017), a tool originally created for machine translation and based on an encoder-decoder architecture with an attention mechanism. We use it to transform technical sentences into simpler ones. We train neural models on the parallel corpora, using different ratios of general-language and specialized sentences. The general-language data is large enough to describe general simplification, but its sentences do not cover the transformations specific to the medical domain; the parallel sentences from the medical domain fill this gap. We also use a lexicon that maps complex medical terms to lay paraphrases (7,580 paraphrases for 4,516 medical terms). This allows three series of experiments: (1) the lexicon is not used, and simplification relies only on the examples in the training corpora; (2) the lexicon is used during the simplification phase, where it tells the model how to replace unknown terms that appear in the lexicon; (3) the lexicon is used during the training phase, where it is added to the training set to complement the parallel corpora. We evaluate the results with three metrics: BLEU (Papineni et al., 2002), SARI (Xu et al., 2016) and Kandel (Kandel & Moles, 1958). Overall, the results indicate that specialized data, even in small quantities, significantly helps simplification.
	Keyword: Computer Science [cs]; automatic text simplification; biomedical domain
	URL: https://hal.archives-ouvertes.fr/hal-03509735
	BASE
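	Illustration: the abstract above describes training sets built from different ratios of general-language and specialized sentence pairs, optionally complemented by the paraphrase lexicon (third series of experiments). The following is only a minimal sketch of that idea, not the authors' code. The function name build_training_set, the parameter general_per_medical, the sampling strategy and the sample lexicon entry are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, str]  # (complex sentence, simplified sentence)


def build_training_set(
    general: List[Pair],
    medical: List[Pair],
    general_per_medical: float,
    lexicon: Optional[Dict[str, List[str]]] = None,
    seed: int = 0,
) -> List[Pair]:
    """Hypothetical helper: mix general and medical pairs at a chosen ratio.

    All medical pairs are kept. General pairs are sampled so that there are
    about `general_per_medical` general pairs for each medical pair. If a
    lexicon is given, every (term, paraphrase) entry is added as one more
    training pair.
    """
    rng = random.Random(seed)
    n_general = min(len(general), round(general_per_medical * len(medical)))
    mixed = list(medical) + rng.sample(general, n_general)
    if lexicon is not None:
        for term, paraphrases in lexicon.items():
            mixed.extend((term, paraphrase) for paraphrase in paraphrases)
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    # Toy data standing in for the translated WikiLarge and medical corpora.
    general = [(f"general complex {i}", f"general simple {i}") for i in range(100)]
    medical = [("medical complex 1", "medical simple 1"),
               ("medical complex 2", "medical simple 2")]
    lexicon = {"dyspnée": ["difficulté à respirer", "essoufflement"]}
    train = build_training_set(general, medical, general_per_medical=3, lexicon=lexicon)
    print(len(train))
```

	The resulting pairs would then be written to source and target files for the translation toolkit; that step is omitted here because its details are not given in the abstract.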

2	Disambiguation of Medical Abbreviations in French with Supervised Methods
	Koptient, Anaïs; Grabar, Natalia
	In: Studies in Health Technology and Informatics ; https://hal.archives-ouvertes.fr/hal-03335532 ; Studies in Health Technology and Informatics, 2021, ⟨10.3233/shti210171⟩ (2021)
	BASE

3	Fine-Grained Simplification of Medical Documents
	Koptient, Anaïs; Londres, Muriel; Grabar, Natalia
	In: Studies in Health Technology and Informatics ; https://hal.archives-ouvertes.fr/hal-03335524 ; Studies in Health Technology and Informatics, 281, pp.308-312, 2021, ⟨10.3233/shti210170⟩ (2021)
	BASE

4	Simplification automatique de textes biomédicaux en français: lorsque des données précises de petite taille aident
	Cardon, Rémi; Grabar, Natalia
	In: Actes de la 28e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale ; Traitement Automatique des Langues Naturelles ; https://hal.archives-ouvertes.fr/hal-03265887 ; Traitement Automatique des Langues Naturelles, 2021, Lille, France. pp.275-277 (2021)
	BASE

5	Modals as predictive factor for L2 proficiency level
	Grabar, Natalia; Cappelle, Bert; Depraetere, Ilse...
	In: ISLE (International society for the Linguistics of English) ; https://hal.archives-ouvertes.fr/hal-03558793 ; ISLE (International society for the Linguistics of English), Jun 2021, Joensuu, Finland (2021)
	BASE

6	Parallel sentence alignment from biomedical comparable corpora
	Cardon, Rémi; Grabar, Natalia
	In: Studies in Health Technology and Informatics ; https://hal.archives-ouvertes.fr/hal-03095183 ; Studies in Health Technology and Informatics, 270, pp.362-366, 2020, ⟨10.3233/SHTI200183⟩ (2020)
	BASE

7	Construction d'un corpus parallèle à partir de corpus comparables pour la simplification de textes médicaux en français
	Cardon, Rémi; Grabar, Natalia
	In: ISSN: 1248-9433 ; EISSN: 1965-0906 ; Revue TAL ; https://hal.archives-ouvertes.fr/hal-03094983 ; Revue TAL, ATALA (Association pour le Traitement Automatique des Langues), 2020 (2020)
	BASE

8	Typologie de transformations dans la simplification de textes
	Koptient, Anaïs; Grabar, Natalia
	In: Congrès mondial de la linguistique française ; https://hal.archives-ouvertes.fr/hal-03095235 ; Congrès mondial de la linguistique française, Jul 2020, Montpellier, France (2020)
	BASE

9	Fine-grained text simplification in French: steps towards a better grammaticality
	Koptient, Anaïs; Grabar, Natalia
	In: ISHIMR Proceedings of the 18th International Symposium on Health Information Management Research ; https://hal.archives-ouvertes.fr/hal-03095247 ; ISHIMR Proceedings of the 18th International Symposium on Health Information Management Research, Sep 2020, Kalmar, Sweden. ⟨10.15626/ishimr.2020.xxx⟩ (2020)
	BASE

10	Rated Lexicon for the Simplification of Medical Texts
	Koptient, Anaïs; Grabar, Natalia
	In: The Fifth International Conference on Informatics and Assistive Technologies for Health-Care, Medical Support and Wellbeing HEALTHINFO 2020 ; https://hal.archives-ouvertes.fr/hal-03095275 ; The Fifth International Conference on Informatics and Assistive Technologies for Health-Care, Medical Support and Wellbeing HEALTHINFO 2020, Oct 2020, Porto, Portugal (2020)
	BASE

11	French Biomedical Text Simplification: When Small and Precise Helps
	Cardon, Rémi; Grabar, Natalia
	In: Proceedings of the 28th International Conference on Computational Linguistics ; The 28th International Conference on Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-03095290 ; The 28th International Conference on Computational Linguistics, Dec 2020, Barcelona (online), Spain (2020)
	BASE

12	RNN embeddings for identifying difficult to understand medical words
	Pylieva, Hanna; Chernodub, Artem; Grabar, Natalia...
	In: ACL Workshop on Biomedical Natural Language Processing ; https://hal.archives-ouvertes.fr/hal-02371219 ; ACL Workshop on Biomedical Natural Language Processing, Aug 2019, Florence, Italy (2019)
	BASE

13	Speculation and negation detection in French biomedical corpora
	Dalloux, Clément; Claveau, Vincent; Grabar, Natalia
	In: RANLP 2019 - Recent Advances in Natural Language Processing ; https://hal.archives-ouvertes.fr/hal-02284444 ; RANLP 2019 - Recent Advances in Natural Language Processing, Sep 2019, Varna, Bulgaria. pp.1-10 (2019)
	BASE

14	Parallel sentence retrieval from comparable corpora for biomedical text simplification
	Cardon, Rémi; Grabar, Natalia
	In: Proceedings of Recent Advances in Natural Language Processing ; RANLP 2019 ; https://hal.archives-ouvertes.fr/hal-02430458 ; RANLP 2019, Sep 2019, Varna, Bulgaria (2019)
	BASE

15	Automatic detection of parallel sentences from comparable biomedical texts
	Cardon, Rémi; Grabar, Natalia
	In: CICLING 2019 ; https://hal.archives-ouvertes.fr/hal-02430419 ; CICLING 2019, Apr 2019, La Rochelle, France (2019)
	BASE

16	Simplification-induced transformations: typology and some characteristics
	Koptient, Anaïs; Cardon, Rémi; Grabar, Natalia
	In: Proceedings of the 18th BioNLP Workshop and Shared Task ; BioNLP 2019 ; https://hal.archives-ouvertes.fr/hal-02430514 ; BioNLP 2019, Aug 2019, Florence, Italy. ⟨10.18653/v1/W19-5033⟩ (2019)
	BASE

17	Annotations d'entités et de relations sur des résumés d'articles scientifiques pour la détection d'interactions entre aliments et médicaments
	Randriatsitohaina, Tsanta; Grouin, Cyril; Bedouch, Pierrick...
	In: TALMED 2019 ; https://hal.archives-ouvertes.fr/hal-02430510 ; TALMED 2019, Aug 2019, Lyon, France (2019)
	BASE

18	Construire une première base de connaissance du patrimoine textile de la région Hauts-de-France
	Kergosien, Eric; Wybo, Mathilde; Cardon, Rémi...
	In: 21ème édition du Colloque International sur le Document numériquE - CiDE.21 “ Économie(s) du document : pratiques et prospectives liées au numérique ” ; https://hal.archives-ouvertes.fr/hal-02561080 ; 21ème édition du Colloque International sur le Document numériquE - CiDE.21 “ Économie(s) du document : pratiques et prospectives liées au numérique ”, Apr 2019, Djerba, Tunisia (2019)
	BASE

19	Query selection methods for automated corpora construction with a use case in food-drug interactions
	Bordea, Georgeta; Randriatsitohaina, Tsanta; Grabar, Natalia...
	In: ACL Workshop on Biomedical Natural Language Processing ; https://hal.archives-ouvertes.fr/hal-02371207 ; ACL Workshop on Biomedical Natural Language Processing, Aug 2019, Florence, Italy. pp.115-124, ⟨10.18653/v1/W19-5013⟩ (2019)
	BASE

20	WikiWars-UA : Ukrainian corpus annotated with temporal expressions
	Grabar, Natalia; Hamon, Thierry
	In: Computational Linguistics and Intelligent Systems ; https://hal.archives-ouvertes.fr/hal-02371237 ; Computational Linguistics and Intelligent Systems, Apr 2019, Kharkiv, Ukraine (2019)
	BASE
