A tree-based inverted file for fast ranked-document retrieval

Wann Yun Shieh*, Tien Fu Chen, Chung Ping Chung

*Corresponding author for this work

Research output: Chapter in book/report/conference proceeding, conference contribution, peer-reviewed

2 Citations (Scopus)

Abstract

Inverted files are widely used to index documents in large-scale information retrieval systems. An inverted file consists of posting lists, which can be stored in either ascending document-identifier order or descending document-weight order. For an identifier-ascending posting list, retrieving ranked documents requires traversing all postings, whereas for a weight-descending posting list, processing Boolean queries is very complex. In this paper, we transform a posting list into a tree-based structure, called the n-key-heap posting tree, to speed up ranked-document retrieval for Boolean queries. This structure preserves the order of document identifiers and the order of document weights simultaneously. To preserve the identifier order, the edge pointers are designed to maintain numerical order in the posting tree. To preserve the weight order, greater-weight postings are stored in higher tree nodes according to the heap property. We model these criteria as a tree-construction problem and propose an efficient algorithm to construct an optimal posting tree with minimal access time.
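
The two ordering criteria in the abstract can be illustrated with a minimal sketch. It is not the authors' n-key-heap posting tree or their optimal construction algorithm: it assumes the simplest case of one posting per node, which reduces to a Cartesian tree (binary search order on document identifier, max-heap order on weight). The names `Node`, `build_posting_tree`, `top_k`, and `in_order` are hypothetical and chosen only for this illustration.

```python
import heapq
from dataclasses import dataclass
from typing import List, Optional, Tuple

Posting = Tuple[int, float]  # (document identifier, document weight)


@dataclass
class Node:
    doc_id: int
    weight: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_posting_tree(postings: List[Posting]) -> Optional[Node]:
    """Build a one-key posting tree (assumption: n = 1, i.e. a Cartesian tree).

    `postings` must be sorted by ascending doc_id. The result keeps
    identifier order in its in-order traversal and weight order through
    the max-heap property. Runs in linear time using a stack.
    """
    stack: List[Node] = []  # right spine of the tree built so far
    for doc_id, weight in postings:
        node = Node(doc_id, weight)
        last = None
        # Lighter nodes on the spine move below the new node, to its left.
        while stack and stack[-1].weight < weight:
            last = stack.pop()
        node.left = last
        if stack:
            stack[-1].right = node
        stack.append(node)
    return stack[0] if stack else None


def top_k(root: Optional[Node], k: int) -> List[Posting]:
    """Return the k heaviest postings without visiting the whole tree."""
    result: List[Posting] = []
    # Frontier of candidate nodes; doc_id breaks ties so nodes are never compared.
    frontier = [(-root.weight, root.doc_id, root)] if root else []
    while frontier and len(result) < k:
        neg_weight, doc_id, node = heapq.heappop(frontier)
        result.append((doc_id, -neg_weight))
        for child in (node.left, node.right):
            if child is not None:
                heapq.heappush(frontier, (-child.weight, child.doc_id, child))
    return result


def in_order(root: Optional[Node]) -> List[int]:
    """Return document identifiers in ascending order (iterative traversal)."""
    out: List[int] = []
    stack: List[Node] = []
    node = root
    while stack or node is not None:
        while node is not None:
            stack.append(node)
            node = node.left
        node = stack.pop()
        out.append(node.doc_id)
        node = node.right
    return out


if __name__ == "__main__":
    postings = [(1, 0.2), (3, 0.9), (4, 0.5), (7, 0.7), (9, 0.1)]
    tree = build_posting_tree(postings)
    print(top_k(tree, 2))   # [(3, 0.9), (7, 0.7)]
    print(in_order(tree))   # [1, 3, 4, 7, 9]
```

In this sketch, ranked retrieval touches only the nodes near the root instead of the full posting list, while the in-order traversal still yields the identifier-ascending sequence that Boolean merging relies on. How the paper packs n keys into a node and chooses the tree shape that minimizes access time is not covered here.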

Original language: English
Title of host publication: Proceedings of the International Conference on Information and Knowledge Engineering 2003
Editor: N. Goharian
Pages: 64-69
Number of pages: 6
Publication status: Published - 2003
Externally published
Event: Proceedings of the International Conference on Information and Knowledge Engineering 2003 - Las Vegas, NV, United States
Duration: 23 June 2003 to 26 June 2003

Publication series

Name: Proceedings of the International Conference on Information and Knowledge Engineering
1

Conference

Conference: Proceedings of the International Conference on Information and Knowledge Engineering 2003
Country/Territory: United States
City: Las Vegas, NV
Period: 23/06/03 to 26/06/03
