Hits 1 – 4 of 4
1
Comparing “parallel passages” in digital archives
Zhang, Dell; Levene, Mark; Harris, Martyn; Levene, D. - Emerald, 2019
Abstract:
Purpose: This paper presents a language-agnostic approach to facilitate the discovery of “parallel passages” stored in historic and cultural heritage digital archives.
Design/methodology/approach: The authors explore a novel and relatively simple approach that combines a character-based statistical language model with a tailored version of the Basic Local Alignment Search Tool (BLAST) to extract exact and approximate string patterns shared between groups of documents.
Findings: The approach is applicable to a wide range of languages and compensates for variability in the text of the documents arising from differences in dialect, authorship and language change over time, as well as from errors introduced by inaccurate transcription and optical character recognition during digitisation.
Research limitations/implications: A number of case studies demonstrate that the approach is practical and generalisable to a wide range of archives with documents in different languages, domains and of varying quality.
Practical implications: The approach described can be applied to any digital archive of modern and contemporary texts. It is therefore applicable not only to digital archives recording historic texts but also to those composed of more recent material, such as news articles.
Social implications: The analysis of “parallel passages” enables researchers to quantify the presence and extent of text reuse in a collection of documents, which can provide useful data on author style, text genres and cultural contexts.
Originality/value: The approach is novel and addresses a need among humanities researchers for tools that can identify similar documents and local similarities, represented by shared text sequences, in a potentially vast archive of documents. As far as the authors are aware, no tools currently exist that provide the same level of tolerance to the language of the documents.
Keyword:
Computer Science and Information Systems
URL:
https://eprints.bbk.ac.uk/id/eprint/28183/
https://doi.org/10.1108/JD-10-2018-0175
https://eprints.bbk.ac.uk/id/eprint/28183/1/28183.pdf
BASE
2
Finding parallel passages in cultural heritage archives
Harris, Martyn; Levene, Mark; Zhang, Dell - ACM, 2018
BASE
3
Finding parallel passages in cultural heritage archives
Harris, Martyn; Levene, Mark; Zhang, Dell - 2018
BASE
4
The Anatomy of a Search and Mining System for Digital Archives ...
Harris, Martyn; Levene, Mark; Zhang, Dell - arXiv, 2016
BASE
© 2013 - 2024 Lin|gu|is|tik