
Search in the Catalogues and Directories

Hits 1 – 11 of 11

1
The ParlaMint corpora of parliamentary proceedings
BASE
2
The ParlaMint corpora of parliamentary proceedings
In: Lang Resour Eval (2022)
BASE
3
Novel database design for extreme scale corpus analysis ...
Coole, Matthew. - : Lancaster University, 2021
BASE
4
Multilingual comparable corpora of parliamentary debates ParlaMint 2.1
BASE
5
Linguistically annotated multilingual comparable corpora of parliamentary debates ParlaMint.ana 2.1
BASE
6
Linguistically annotated multilingual comparable corpora of parliamentary debates ParlaMint.ana 2.0
BASE
7
Multilingual comparable corpora of parliamentary debates ParlaMint 2.0
BASE
8
Novel database design for extreme scale corpus analysis
Coole, Matthew. - : Lancaster University, 2021
Abstract: This thesis presents the patterns and methods uncovered in the development of LexiDB, a new scalable corpus database management system that can handle the ever-growing size of modern corpus datasets. It begins with a survey of existing corpus data systems, examining their usage in corpus linguistics and their underlying architectures. The survey finds that existing systems are designed primarily to be vertically scalable (i.e. scalable through bigger, better and faster hardware). This motivates a wider examination of modern distributable database management systems and of the information retrieval techniques used for indexing and retrieval. These techniques are modified and adapted into an architecture that can be scaled horizontally to handle ever larger corpora. Based on this architecture, several new querying and retrieval methods that improve upon existing techniques are proposed as modern approaches to querying extremely large annotated text collections for corpus analysis. The evaluation of these techniques and of the architecture's scalability demonstrates that the architecture is comparably scalable to two modern NoSQL database management systems and outperforms existing corpus data systems in token-level pattern querying, while still supporting character-level pattern matching.
URL: https://eprints.lancs.ac.uk/id/eprint/151582/
https://eprints.lancs.ac.uk/id/eprint/151582/1/2021coolephd.pdf
https://doi.org/10.17635/lancaster/thesis/1236
BASE
9
Unfinished Business: Construction and Maintenance of a Semantically Tagged Historical Parliamentary Corpus, UK Hansard from 1803 to the present day
Coole, Matthew; Rayson, Paul; Mariani, John. - : European Language Resources Association (ELRA), 2020
BASE
10
LexiDB: Patterns & Methods for Corpus Linguistic Database Management
Coole, Matthew; Rayson, Paul; Mariani, John. - : European Language Resources Association (ELRA), 2020
BASE
11
Infrastructure for Semantic Annotation in the Genomics Domain
El-Haj, Mahmoud; Rutherford, Nathan; Coole, Matthew. - : European Language Resources Association (ELRA), 2020
BASE
