1 |
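Selbstreparaturen in der schriftlichen Interaktion : eine kontrastive Analyse deutscher und russischer Kurznachrichtenkommunikation [Self-repairs in written interaction: a contrastive analysis of German and Russian text-message communication]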
|
|
|
|
BLLDB
|
|
UB Frankfurt Linguistik
|
|
2 |
Multilingual Email Zoning - Segmenting Multilingual Email Text Into Zones
|
|
|
|
BASE
|
|
7 |
Books of Hours: the First Liturgical Corpus for Text Segmentation
|
|
|
|
In: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), May 2020, Marseille (Virtual), France, pp. 776-784 ; https://hal.archives-ouvertes.fr/hal-02931294 (2020)
|
|
Abstract:
The Book of Hours was the bestseller of the late Middle Ages and Renaissance. It is a historical invaluable treasure, documenting the devotional practices of Christians in the late Middle Ages. Up to now, its textual content has been scarcely studied because of its manuscript nature, its length and its complex content. At first glance, it looks too standardized. However, the study of book of hours raises important challenges: (i) in image analysis, its often lavish ornamentation (illegible painted initials, line-fillers, etc.), abbreviated words, multilingualism are difficult to address in Handwritten Text Recognition (HTR); (ii) its hierarchical entangled structure offers a new field of investigation for text segmentation; (iii) in digital humanities, its textual content gives opportunities for historical analysis. In this paper, we provide the first corpus of books of hours, which consists of Latin transcriptions of 300 books of hours generated by Handwritten Text Recognition (HTR) - that is like Optical Character Recognition (OCR) but for handwritten and not printed texts. We designed a structural scheme of the book of hours and annotated manually two books of hours according to this scheme. Lastly, we performed a systematic evaluation of the main state of the art text segmentation approaches.
|
|
Keyword:
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]; [SHS.HIST]Humanities and Social Sciences/History; [SHS.LANGUE]Humanities and Social Sciences/Linguistics; [SHS.MUSEO]Humanities and Social Sciences/Cultural heritage and museology; books of hours; hierarchical segmentation; structural scheme; text segmentation
|
|
URL: https://hal.archives-ouvertes.fr/hal-02931294/file/2020.lrec-1.97.pdf https://hal.archives-ouvertes.fr/hal-02931294/document https://hal.archives-ouvertes.fr/hal-02931294
|
|
BASE
|
|
8 |
Hierarchical Text Segmentation for Medieval Manuscripts
|
|
|
|
In: COLING'2020, The 28th International Conference on Computational Linguistics, Dec 2020, Barcelona, Spain, pp. 6240-6251 ; https://hal.archives-ouvertes.fr/hal-03100170 ; https://www.aclweb.org/anthology/2020.coling-main.549.pdf (2020)
|
|
BASE
|
|
13 |
Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems
|
|
BAYOMI, MOSTAFA MOHAMED. - : Trinity College Dublin. School of Computer Science & Statistics. Discipline of Computer Science, 2019
|
|
BASE
|
|
14 |
Using NLP Techniques to Enhance Content Discoverability and Reusability for Adaptive Systems
|
|
BAYOMI, MOSTAFA. - : Trinity College Dublin. School of Computer Science & Statistics. Discipline of Computer Science, 2019
|
|
BASE
|
|
16 |
Working on Computer-Assisted Translation platforms: New advantages and new mistakes
|
|
|
|
In: Russian journal of linguistics: Vestnik RUDN, Vol 23, Iss 2, Pp 544-561 (2019)
|
|
BASE
|
|
|
|