Catalogue search • Linguistik portal • Fachinformationsdienst (FID)

1	Efficient in-memory top-k document retrieval
	Matthias Petri; Falk Scholer
	In: http://goanna.cs.rmit.edu.au/~e76763/publications/cps12-sigir.pdf (2012)
	Abstract: For over forty years the dominant data structure for ranked document retrieval has been the inverted index. Inverted indexes are effective for a variety of document retrieval tasks, and particularly efficient for large data collection scenarios that require disk access and storage. However, many efficiency-bound search tasks can now easily be supported entirely in-memory as a result of recent hardware advances. In this paper we present a hybrid algorithmic framework for in-memory bag-of-words ranked document retrieval using a self-index derived from the FM-Index, wavelet tree, and the compressed suffix tree data structures, and evaluate the various algorithmic trade-offs for performing efficient queries entirely in-memory. We compare our approach with two classic approaches to bag-of-words queries using inverted indexes, term-at-a-time (TAAT) and document-at-a-time (DAAT) query processing. We show that our framework is competitive with state-of-the-art indexing structures, and describe new capabilities provided by our algorithms that can be leveraged by future systems to improve effectiveness and efficiency for a variety of fundamental search operations.
	Keyword: Data Storage Representations; Experimentation; I.7.3 [Document and Text Processing]; Measurement; Performance; query formulation; retrieval models; search process; Text Compression; Text Processing - index generation; Text Indexing
	URL: http://goanna.cs.rmit.edu.au/~e76763/publications/cps12-sigir.pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.415.5098
	BASE

2	Language independent ranked retrieval with NeWT
	Michiko Yasukawa; Falk Scholer
	In: http://goanna.cs.rmit.edu.au/~e76763/publications/cys11-adcs.pdf (2011)
	BASE
