1 |
Combining NLP and probabilistic categorisation for document and term selection for Swiss-Prot medical annotation
|
|
|
|
BASE
|
|
2 |
Multiview Semi-Supervised Learning for Ranking Multilingual Documents
|
|
|
|
In: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Sep 2011, Athens, Greece, pp. 443-458 ; https://hal.archives-ouvertes.fr/hal-01286156 ; ⟨10.1007/978-3-642-23808-6_29⟩ (2011)
|
|
BASE
|
|
3 |
A Co-classification Approach to Learning from Multilingual Corpora
|
|
|
|
In: Machine Learning, Springer Verlag, 2010, 79 (1-2), pp. 105-121 ; ISSN: 0885-6125 ; EISSN: 1573-0565 ; https://hal.archives-ouvertes.fr/hal-01172633 ; ⟨10.1007/s10994-009-5151-5⟩ (2010)
|
|
BASE
|
|
4 |
Multiview Clustering of Multilingual Documents
|
|
|
|
In: Proceedings of the 33rd Annual ACM SIGIR Conference (SIGIR 2010), Jul 2010, Geneva, Switzerland, pp. 812-822 ; https://hal.archives-ouvertes.fr/hal-01292100 ; ⟨10.1145/1835449.1835633⟩ (2010)
|
|
BASE
|
|
5 |
Combining Coregularization and Consensus-Based Self-Training for Multilingual Text Categorization
|
|
|
|
In: Proceedings of the 33rd Annual ACM SIGIR Conference (SIGIR 2010), Jul 2010, Geneva, Switzerland, pp. 475-482 ; https://hal.archives-ouvertes.fr/hal-01291883 ; ⟨10.1145/1835449.1835529⟩ (2010)
|
|
BASE
|
|
7 |
Learning from Multiple Partially Observed Views -- an Application to Multilingual Text Categorization
|
|
|
|
In: Advances in Neural Information Processing Systems, Dec 2009, Vancouver, Canada ; https://hal.archives-ouvertes.fr/hal-01297947 (2009)
|
|
BASE
|
|
10 |
Combining NLP and probabilistic categorisation for document and term selection for Swiss-Prot medical annotation
|
|
|
|
Abstract:
Motivation: Searching relevant publications for manual database annotation is a tedious task. In this paper, we apply a combination of Natural Language Processing (NLP) and probabilistic classification to re-rank documents returned by PubMed according to their relevance to Swiss-Prot annotation, and to identify significant terms in the documents.
Results: With a Probabilistic Latent Categoriser (PLC) we obtained 69% recall and 59% precision for relevant documents in a representative query. As the PLC technique provides the relative contribution of each term to the final document score, we used the Kullback-Leibler symmetric divergence to determine the most discriminating words for Swiss-Prot medical annotation. This information should allow curators to understand classification results better. It also has great value for fine-tuning the linguistic pre-processing of documents, which in turn can improve the overall classifier performance.
Availability: The medical annotation dataset is available from the authors upon request.
Contact: Pavel.Dobrokhotov@isb-sib.ch ; Cyril.Goutte@xrce.xerox.com
* To whom correspondence should be addressed.
|
|
Keyword:
ORIGINAL PAPERS
|
|
URL: http://bioinformatics.oxfordjournals.org/cgi/content/short/19/suppl_1/i91 https://doi.org/10.1093/bioinformatics/btg1011
|
|
BASE
|
|