1 |
Assessing the impact of OCR noise on multilingual event detection over digitised documents
In: ISSN: 1432-5012 ; EISSN: 1432-1300 ; International Journal on Digital Libraries ; https://hal.archives-ouvertes.fr/hal-03635985 ; International Journal on Digital Libraries, Springer Verlag, 2022, ⟨10.1007/s00799-022-00325-2⟩ (2022)
BASE
2 |
Computational Measures of Deceptive Language: Prospects and Issues
In: ISSN: 2297-900X ; EISSN: 2297-900X ; Frontiers in Communication ; https://hal.archives-ouvertes.fr/hal-03629780 ; Frontiers in Communication, Frontiers, 2022, 7, pp.792378. ⟨10.3389/fcomm.2022.792378⟩ (2022)
BASE
3 |
Multiword Expression Features for Automatic Hate Speech Detection
In: NLDB 2021 - 26th International Conference on Natural Language & Information Systems ; https://hal.archives-ouvertes.fr/hal-03231047 ; NLDB 2021 - 26th International Conference on Natural Language & Information Systems, Jun 2021, Saarbrücken/Virtual, Germany ; http://nldb2021.sb.dfki.de/ (2021)
BASE
4 |
Hate speech and offensive language detection using transfer learning approaches ; Détection du discours de haine et du langage offensant utilisant des approches de Transfer Learning
In: https://tel.archives-ouvertes.fr/tel-03276023 ; Document and Text Processing. Institut Polytechnique de Paris, 2021. English. ⟨NNT : 2021IPPAS007⟩ (2021)
BASE
5 |
A Multilingual Dataset for Named Entity Recognition, Entity Linking and Stance Detection in Historical Newspapers
In: SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval ; https://hal.archives-ouvertes.fr/hal-03418387 ; SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Jul 2021, Virtual Event, Canada. pp.2328-2334, ⟨10.1145/3404835.3463255⟩ (2021)
BASE
6 |
Impact Analysis of Document Digitization on Event Extraction ...
BASE
7 |
Impact Analysis of Document Digitization on Event Extraction ...
BASE
8 |
Fine-Grained Implicit Sentiment in Financial News: Uncovering Hidden Bulls and Bears
In: Electronics ; Volume 10 ; Issue 20 (2021)
BASE
9 |
Analyzing Non-Textual Content Elements to Detect Academic Plagiarism
BASE
10 |
Cross language plagiarism detection with contextualized word embeddings ; Detecção de plágio multilíngue usando word embeddings contextualizadas
BASE
12 |
Application-Oriented Approach for Detecting Cyberaggression in Social Media
In: International Conference on Applied Human Factors and Ergonomics ; https://hal.archives-ouvertes.fr/hal-02903422 ; International Conference on Applied Human Factors and Ergonomics, Jul 2020, San Diego, United States. pp.129-136, ⟨10.1007/978-3-030-51328-3_19⟩ ; https://link.springer.com/chapter/10.1007%2F978-3-030-51328-3_19 (2020)
BASE
13 |
Affective behavior modeling on social networks ; Modélisation des sentiments sur les réseaux sociaux
In: https://tel.archives-ouvertes.fr/tel-03339755 ; Social and Information Networks [cs.SI]. Université Montpellier, 2020. English. ⟨NNT : 2020MONTS073⟩ (2020)
BASE
14 |
Impact Analysis of Document Digitization on Event Extraction
In: CEUR Workshop Proceedings ; 4th Workshop on Natural Language for Artificial Intelligence (NL4AI 2020) co-located with the 19th International Conference of the Italian Association for Artificial Intelligence (AI*IA 2020) ; https://hal.archives-ouvertes.fr/hal-03026148 ; 4th Workshop on Natural Language for Artificial Intelligence (NL4AI 2020) co-located with the 19th International Conference of the Italian Association for Artificial Intelligence (AI*IA 2020), Nov 2020, Virtual, Italy. pp.17-28 ; http://sag.art.uniroma2.it/NL4AI/ (2020)
BASE
15 |
Detecting deviations from activities of daily living routines using kinect depth maps and power consumption data
In: Research outputs 2014 to 2021 (2020)
BASE
18 |
Sequence Covering for Efficient Host-Based Intrusion Detection
In: ISSN: 1556-6013 ; IEEE Transactions on Information Forensics and Security ; https://hal.archives-ouvertes.fr/hal-01653650 ; IEEE Transactions on Information Forensics and Security, Institute of Electrical and Electronics Engineers, 2019, 14 (4), pp.994-1006. ⟨10.1109/TIFS.2018.2868614⟩ ; https://ieeexplore.ieee.org/document/8454473 (2019)
BASE
19 |
StoryMiner: An Automated and Scalable Framework for Story Analysis and Detection from Social Media
BASE
20 |
A novel framework for biomedical entity sense induction
In: ISSN: 1532-0464 ; EISSN: 1532-0480 ; Journal of Biomedical Informatics ; https://hal-lirmm.ccsd.cnrs.fr/lirmm-01851988 ; Journal of Biomedical Informatics, Elsevier, 2018, 84, pp.31-41. ⟨10.1016/j.jbi.2018.06.007⟩ (2018)
Abstract:
International audience ; Background: Rapid advancements in biomedical research have sharply increased the number of relevant electronic documents published online, ranging from scholarly articles to news, blogs, and user-generated social media content. Nevertheless, this vast amount of information is poorly organized, making it difficult to navigate. Emerging technologies such as ontologies and knowledge bases (KBs) could help organize and track the information associated with biomedical research developments. A major challenge in the automatic construction of ontologies and KBs is the identification of words with their respective sense(s) from a free-text corpus. Word-sense induction (WSI) is the task of automatically inducing the different senses of a target word in different contexts. In the last two decades, there have been several efforts on WSI. However, few methods are effective in biomedicine and life sciences.
Methods: We developed a framework for biomedical entity sense induction using a mixture of natural language processing, supervised, and unsupervised learning methods with promising results. It is composed of three main steps: 1) a polysemy detection method to determine if a biomedical entity has many possible meanings; 2) a clustering quality index-based approach to predict the number of senses for the biomedical entity; and 3) a method to induce the concept(s) (i.e., senses) of the biomedical entity in a given context.
Results: To evaluate our framework, we used the well-known MSH WSD polysemic dataset that contains 203 annotated ambiguous biomedical entities, where each entity is linked to 2 to 5 concepts. First, our polysemy detection method obtained an F-measure of 98%. Second, our approach for predicting the number of senses achieved an F-measure of 93%. Finally, we induced the concepts of the biomedical entities based on a clustering algorithm and then extracted the keywords of each cluster to represent the concept.
Conclusions: We have developed a framework for biomedical entity sense induction with promising results. Our study results can benefit a number of downstream applications, for example, by helping to resolve concept ambiguities when building Semantic Web KBs from biomedical text.
Note: This paper is a significant extension of our previous studies published in the proceedings of: 1) The 10th International Conference on Language Resources and Evaluation (LREC'2016) titled "Automatic biomedical term polysemy detection"; and 2) The 19th International Conference on Extending Database Technology (EDBT'2016) titled "A way to automatically enrich biomedical ontologies", with new scientific methodologies and results.
Keyword:
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]; [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]; [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing; [INFO.INFO-WB]Computer Science [cs]/Web; Biomedicine; BioNLP; Classification; Clustering; Number of cluster prediction; Polysemy detection; Word sense induction
URL: https://hal-lirmm.ccsd.cnrs.fr/lirmm-01851988v2/document https://hal-lirmm.ccsd.cnrs.fr/lirmm-01851988 https://hal-lirmm.ccsd.cnrs.fr/lirmm-01851988v2/file/Article-JBI-Lossio-2018_YJBIN_2996_final.pdf https://doi.org/10.1016/j.jbi.2018.06.007
BASE