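1. Évaluation d'une nouvelle structuration thématique hiérarchique des textes dans un cadre de résumé automatique et de détection d'ancres au sein de vidéos [English: Evaluation of a new hierarchical topic structuring of texts for automatic summarization and anchor detection in videos]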
In: Actes de la conférence TALN (Conférence sur le Traitement Automatique des Langues Naturelles), 2016, Paris, France, pp. 139-152. https://hal.archives-ouvertes.fr/hal-01399670
2. Hierarchical topic structuring: from dense segmentation to topically focused fragments via burst analysis
In: Recent Advances on Natural Language Processing, 2015, Hissar, Bulgaria. https://hal.archives-ouvertes.fr/hal-01186443
3. Leveraging lexical cohesion and disruption for topic segmentation
In: Proceedings of International Conference on Empirical Methods in Natural Language Processing, EMNLP 2013, Oct 2013, Seattle, United States, pp. 1314-1324. https://hal.archives-ouvertes.fr/hal-00867011
Abstract:
Topic segmentation classically relies on one of two criteria: finding areas with coherent vocabulary use, or detecting discontinuities. In this paper, we propose a segmentation criterion that combines lexical cohesion and lexical disruption, enabling a trade-off between the two. We provide the mathematical formulation of the criterion and an efficient graph-based decoding algorithm for topic segmentation. Experimental results on standard textual data sets and on a more challenging corpus of automatically transcribed broadcast news shows demonstrate the benefit of such a combination. Gains were observed in all conditions, with segments of either regular or varying length and with abrupt or smooth topic shifts. Long segments benefit more than short segments. However, the algorithm has proven robust on automatic transcripts with short segments and limited vocabulary reoccurrences.
Keywords:
[INFO.INFO-TT] Computer Science [cs] / Document and Text Processing; lexical cohesion; lexical disruption; topic segmentation
URL: https://hal.archives-ouvertes.fr/hal-00867011 https://hal.archives-ouvertes.fr/hal-00867011/document https://hal.archives-ouvertes.fr/hal-00867011/file/emnlp.pdf
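4. Un modèle segmental probabiliste combinant cohésion lexicale et rupture lexicale pour la segmentation thématique [English: A probabilistic segmental model combining lexical cohesion and lexical disruption for topic segmentation]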
In: TALN - Conférence sur le traitement automatique des langues naturelles, ATALA, Jun 2013, Les Sables d'Olonne, France. https://hal.inria.fr/hal-00844112 ; http://www.taln2013.org/actes/www/TALN-2013/actes/taln-2013-long-015.pdf