2 |
Terminologies augmented recurrent neural network model for clinical named entity recognition
|
|
|
|
In: ISSN: 1532-0464 ; EISSN: 1532-0480 ; Journal of Biomedical Informatics ; https://hal.archives-ouvertes.fr/hal-02428771 ; Journal of Biomedical Informatics, Elsevier, 2020, 102, pp.103356. ⟨10.1016/j.jbi.2019.103356⟩ (2020)
|
|
Abstract:
OBJECTIVE: We aimed to enhance the performance of a supervised model for clinical named-entity recognition (NER) using medical terminologies. To evaluate our system in French, we built a corpus for 5 types of clinical entities.
METHODS: We used a terminology-based system built upon UMLS and SNOMED as the baseline. We then evaluated a biGRU-CRF and a hybrid system that uses the prediction of the terminology-based system as a feature for the biGRU-CRF. In French, we built APcNER, a corpus of 147 documents annotated for 5 entities (Drug names, Signs or symptoms, Diseases or disorders, Diagnostic procedures or lab tests, and Therapeutic procedures). We evaluated each NER system using the exact-match and partial-match definitions of the F-measure for NER. APcNER contains 4,837 entities, which took 28 h to annotate. The inter-annotator agreement as measured by Cohen's kappa was substantial for non-exact match (κ = 0.61) and moderate for exact match (κ = 0.42). In English, we evaluated the NER systems on the i2b2-2009 Medication Challenge for drug name recognition, which contained 8,573 entities in 268 documents, and on i2b2-small, a version reduced to match the number of entities in APcNER.
RESULTS: For drug name recognition on both i2b2-2009 and APcNER, the biGRU-CRF performed better than the terminology-based system, with an exact-match F-measure of 91.1% versus 73% and 81.9% versus 75%, respectively. For i2b2-small and APcNER, the hybrid system outperformed the biGRU-CRF, with an exact-match F-measure of 87.8% versus 85.6% and 86.4% versus 81.9%, respectively. On the APcNER corpus, the micro-average F-measure of the hybrid system on the 5 entities was 69.5% in exact match and 84.1% in non-exact match.
CONCLUSION: APcNER is a French corpus for clinical NER of five types of entities which covers a large variety of document types. The extension of the supervised model with terminology has allowed an easy increase in performance, especially for rare entities, and established near state-of-the-art results on the i2b2-2009 corpus.
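The abstract reports F-measures under both exact and non-exact (partial) matching without defining the matching criteria. As a rough illustration only, the sketch below assumes entities are given as (start, end, type) spans, that an exact match requires identical boundaries and type, and that a partial match requires the same type and any character overlap. The names `Entity`, `overlaps`, and `f_measure` are hypothetical and are not taken from the paper, whose actual definitions may differ.

```python
from typing import List, Tuple

# Hypothetical span representation: (start offset, end offset exclusive, entity type).
Entity = Tuple[int, int, str]


def overlaps(a: Entity, b: Entity) -> bool:
    """Assumed partial-match rule: same type and at least one shared character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def f_measure(gold: List[Entity], pred: List[Entity], exact: bool = True) -> float:
    """F1 over entity spans under exact or (assumed) overlap-based matching."""
    match = (lambda a, b: a == b) if exact else overlaps
    # Precision: share of predicted entities that match some gold entity.
    matched_pred = sum(any(match(p, g) for g in gold) for p in pred)
    # Recall: share of gold entities that are matched by some prediction.
    matched_gold = sum(any(match(g, p) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration (not from APcNER or i2b2).
gold = [(0, 7, "Drug"), (12, 20, "Disease"), (30, 35, "Drug")]
pred = [(0, 7, "Drug"), (12, 18, "Disease"), (40, 45, "Drug")]

exact_f1 = f_measure(gold, pred, exact=True)
partial_f1 = f_measure(gold, pred, exact=False)
print(f"exact F1 = {exact_f1:.3f}, partial F1 = {partial_f1:.3f}")
```

In this toy example the second prediction has the right type but a truncated boundary, so it counts only under the overlap rule. That is the kind of gap that separates the exact and non-exact scores quoted in the abstract.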
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; APcNER; Clinical natural language processing; Information extraction; Machine learning; Named entity recognition
|
|
URL: https://hal.archives-ouvertes.fr/hal-02428771 https://doi.org/10.1016/j.jbi.2019.103356
|
|
BASE
|
3 |
CAS: corpus of clinical cases in French
|
|
|
|
In: ISSN: 2041-1480 ; Journal of Biomedical Semantics ; https://hal.archives-ouvertes.fr/hal-03021064 ; Journal of Biomedical Semantics, BioMed Central, 2020, ⟨10.1186/s13326-020-00225-x⟩ (2020)
|
4 |
Language as a Cognitive Tool to Imagine Goals in Curiosity-Driven Exploration
|
|
|
|
In: NeurIPS 2020 - 34th Conference on Neural Information Processing Systems ; https://hal.archives-ouvertes.fr/hal-03083158 ; NeurIPS 2020 - 34th Conference on Neural Information Processing Systems, Dec 2020, Vancouver / Virtual, Canada (2020)
|
5 |
Normalisation of 16th and 17th century texts in French and geographical named entity recognition
|
|
|
|
In: 4th ACM SIGSPATIAL International Workshop on Geospatial Humanities ; ACM SIGSPATIAL GeoHumanities'20 ; https://hal-upec-upem.archives-ouvertes.fr/hal-02955867 ; ACM SIGSPATIAL GeoHumanities'20, ACM, Nov 2020, Seattle (virtual), United States. pp.28-34, ⟨10.1145/3423337.3429437⟩ ; https://ludovicmoncla.github.io/sigspatial-geohumanities-2020/ (2020)
|
6 |
FlauBERT: Unsupervised Language Model Pre-training for French
|
|
|
|
In: Proceedings of the 12th Language Resources and Evaluation Conference ; LREC ; https://hal.archives-ouvertes.fr/hal-02890258 ; LREC, 2020, Marseille, France (2020)
|
7 |
The Impact of Specialized Corpora for Word Embeddings in Natural Langage Understanding.
|
|
|
|
In: ISSN: 0926-9630 ; EISSN: 1879-8365 ; Studies in Health Technology and Informatics ; https://hal.inria.fr/hal-03476839 ; Studies in Health Technology and Informatics, IOS Press, 2020, 270, pp.432-436. ⟨10.3233/SHTI200197⟩ (2020)
|
8 |
Evaluation of Embeddings in Medication Domain for Spanish Language Using Joint Natural Language Understanding
|
|
|
|
In: IFMBE Proceedings ; https://hal.archives-ouvertes.fr/hal-03294349 ; IFMBE Proceedings, 2020, 8th European Medical and Biological Engineering Conference, 80, pp.510 - 517. ⟨10.1007/978-3-030-64610-3_58⟩ (2020)
|
9 |
Improving Short Text Classification Through Global Augmentation Methods
|
|
|
|
In: Lecture Notes in Computer Science ; 4th International Cross-Domain Conference for Machine Learning and Knowledge Extraction (CD-MAKE) ; https://hal.inria.fr/hal-03414750 ; 4th International Cross-Domain Conference for Machine Learning and Knowledge Extraction (CD-MAKE), Aug 2020, Dublin, Ireland. pp.385-399, ⟨10.1007/978-3-030-57321-8_21⟩ (2020)
|
10 |
A User-Centric and Sentiment Aware Privacy-Disclosure Detection Framework Based on Multi-Input Neural Network
|
|
|
|
In: Computer Science Faculty Publications and Presentations (2020)
|
12 |
Inference Annotation of a Chinese Corpus for Opinion Mining
|
|
|
|
In: LREC ; https://hal-inalco.archives-ouvertes.fr/hal-02507170 ; LREC, May 2020, Marseille, France (2020)
|
13 |
Lexicon-Grammar based open information extraction from natural language sentences in Italian
|
|
|
|
In: ISSN: 0957-4174 ; Expert Systems with Applications ; https://hal.archives-ouvertes.fr/hal-02291746 ; Expert Systems with Applications, Elsevier, 2020, pp.112954. ⟨10.1016/j.eswa.2019.112954⟩ (2020)
|
14 |
Syntactic and Semantic Impact of Prepositions in Machine Translation : An Empirical Study of French-English Translation of Prepositions ‘à’, ‘de’ and ‘en’
|
|
|
|
In: Human Language Technology. Challenges for Computer Science and Linguistics 8th Language and Technology Conference, LTC 2017, Poznań, Poland, November 17–19, 2017, Revised Selected Papers ; 8th Language and Technology Conference (LTC 2017) ; https://hal-lirmm.ccsd.cnrs.fr/lirmm-03091307 ; Human Language Technology. Challenges for Computer Science and Linguistics 8th Language and Technology Conference, LTC 2017, Poznań, Poland, November 17–19, 2017, Revised Selected Papers, 12598, pp.273-287, 2020, Lecture Notes in Computer Science, 978-3-030-66526-5. ⟨10.1007/978-3-030-66527-2_20⟩ (2020)
|
15 |
State Machine based Human-Bot Conversation Model and Services
|
|
|
|
In: CAiSE 2020: 32nd International Conference on Advanced Information Systems Engineering ; https://hal.archives-ouvertes.fr/hal-03122974 ; CAiSE 2020: 32nd International Conference on Advanced Information Systems Engineering, Jun 2020, Grenoble, France. pp.199-214, ⟨10.1007/978-3-030-49435-3_13⟩ (2020)
|
16 |
Place perception from the fusion of different image representation
|
|
|
|
In: Li, P, Li, X, Li, X, Pan, H, Khyam, MO, Noor-A-Rahim, M, Ge, SS, (2020). Place perception from the fusion of different image representation. Pattern Recognition, Vol. 110, p. 1-11 http://dx.doi.org/10.1016/j.patcog.2020.107680 (2020)
|
18 |
MultiMWE: building a multi-lingual multi-word expression (MWE) parallel corpora
|
|
|
|
In: Han, Lifeng orcid:0000-0002-3221-2185 , Jones, Gareth J.F. orcid:0000-0003-2923-8365 and Smeaton, Alan F. orcid:0000-0003-1028-8389 (2020) MultiMWE: building a multi-lingual multi-word expression (MWE) parallel corpora. In: 12th International Conference on Language Resources and Evaluation (LREC), 11-16 May, 2020, Marseille, France. (Virtual). (2020)
|
19 |
MultiMWE: building a multi-lingual multi-word expression (MWE) parallel corpora
|
|
|
|
In: Han, Lifeng, Jones, Gareth orcid:0000-0003-2923-8365 and Smeaton, Alan orcid:0000-0003-1028-8389 (2020) MultiMWE: building a multi-lingual multi-word expression (MWE) parallel corpora. In: International Conference on Language Resources and Evaluation (LREC), 11-16 May, 2020, Marseille, France. (2020)
|
20 |
Using Twitter Streams for Opinion Mining: a case study on Airport Noise
|
|
|
|
In: ISSN: 1865-0929 ; Communications in Computer and Information Science ; https://hal.archives-ouvertes.fr/hal-03018998 ; Communications in Computer and Information Science, Springer Verlag, 2020, ⟨10.1007/978-3-030-44900-1_10⟩ (2020)