1 |
Analysing discourse and text complexity for learning and collaborating ; L'analyse de la complexité du discours et du texte pour apprendre et collaborer
In: https://tel.archives-ouvertes.fr/tel-00978420 ; Education. Université de Grenoble; Universitatea politehnica (Bucarest), 2013. French. ⟨NNT : 2013GRENH004⟩ (2013)
|
|
BASE
|
2 |
Pour une démarche centrée sur l'utilisateur dans les ENT. Apport au Traitement Automatique des Langues. [Towards a user-centred approach in ENTs: a contribution to natural language processing]
In: https://tel.archives-ouvertes.fr/tel-01070522 ; Information and communication sciences. Université de Caen, 2013 (2013)
|
|
BASE
|
3 |
Combining an expert-based medical entity recognizer to a machine-learning system: methods and a case-study
In: Biomedical Informatics Insights ; https://hal.archives-ouvertes.fr/hal-01972779 ; Biomedical Informatics Insights, 2013, 13p (2013)
|
|
Abstract:
Medical entity recognition is currently generally performed by data-driven methods based on supervised machine learning. Expert-based systems, where linguistic and domain expertise are directly provided to the system, for instance in the form of lexicons and pattern-based rules, are often combined with data-driven systems. We present here a case study where an existing expert-based medical entity recognition system, Ogmios, is combined with a data-driven system, Caramba, based on a linear-chain Conditional Random Field (CRF) classifier. We examine different methods to combine two such systems and test the most relevant ones through experiments performed on the i2b2/VA 2012 challenge data. Our case study specifically highlights the risk of overfitting incurred by an expert-based system. We observe that it prevents the combination of the two systems from obtaining improvements in precision, recall, or F-measure, and analyse the underlying mechanisms through a post-hoc feature-level analysis. We also observe that wrapping the expert-based system alone as attributes input to a CRF classifier does boost its F-measure from 0.603 to 0.710 (strict matching of types and boundaries, as per the conlleval program), bringing it on par with the data-driven system. The generality of this method remains to be further investigated.
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO]Computer Science [cs]; Hybrid Methods; Information Extraction; Machine Learning; Medical records; Natural Language Processing; Overfitting
|
|
URL: https://hal.archives-ouvertes.fr/hal-01972779
|
|
BASE
|
4 |
“Facets” and “Prisms” as a Means to Achieve Pedagogical Indexation of Texts for Language Learning: Consequences of the Notion of Pedagogical Context
In: Software and Data Technologies ; https://hal.archives-ouvertes.fr/hal-01294208 ; José Cordeiro; Maria Virvou; Boris Shishkov. Software and Data Technologies, 170, Springer-Verlag, pp.253-268, 2013, Communications in Computer and Information Sciences, 978-3-642-29577-5. ⟨10.1007/978-3-642-29578-2_16⟩ ; http://link.springer.com/chapter/10.1007/978-3-642-29578-2_16 (2013)
|
|
BASE
|
6 |
A study on plagiarism detection and plagiarism direction identification using natural language processing techniques
BASE
|
7 |
Effective active learning for complex natural language processing tasks ; Aktives Lernen für komplexe Aufgaben der Maschinellen Sprachverarbeitung
BASE