Abstract
In this paper, we assess the Latent Semantic Indexing (LSI) approach for Chinese information filtering. The assessment covers Chinese news filtering agents that use a character-based, hierarchical filtering scheme. The traditional vector space model serves as the information filtering model, so each document is converted into a vector of term weights. Instead of using words as terms, as is the dominant tradition in information retrieval (IR), we use individual Chinese characters as terms. LSI captures the semantic relationships between documents and Chinese characters. We use the Singular Value Decomposition (SVD) technique to compress the term space into a lower-dimensional space that captures latent associations between documents and terms. Our experiments show that the recall and precision of character-based Chinese news filtering are satisfactory when the LSI technique is incorporated into the information filtering system.
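The character-based LSI pipeline described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy documents, the raw character counts used as term weights, the choice of `k = 2`, and all function names (`char_doc_matrix`, `lsi_fit`, `fold_in`, `cosine`) are assumptions made for illustration, and NumPy is assumed to be available. The hierarchical filtering scheme is not modeled.

```python
import numpy as np

def char_doc_matrix(docs):
    """Build a character-by-document count matrix (rows: characters, columns: documents)."""
    vocab = sorted({ch for d in docs for ch in d if not ch.isspace()})
    index = {ch: i for i, ch in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for ch in d:
            if ch in index:
                A[index[ch], j] += 1.0  # assumption: raw counts stand in for term weights
    return A, index

def lsi_fit(A, k):
    """Truncated SVD: keep the k largest singular triplets of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Rows of Vt[:k].T are the k-dimensional coordinates of the documents.
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in(text, index, U_k, s_k):
    """Project a new text into the latent space: q_hat = q^T U_k diag(s_k)^-1."""
    q = np.zeros(len(index))
    for ch in text:
        if ch in index:  # characters unseen during fitting are ignored
            q[index[ch]] += 1.0
    return (q @ U_k) / s_k

def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Hypothetical toy corpus: two finance snippets, two sports snippets.
docs = ["股市上涨", "股票市场下跌", "足球比赛", "篮球比赛结果"]
A, index = char_doc_matrix(docs)
U_k, s_k, doc_coords = lsi_fit(A, k=2)

# A user profile expressed as a short text, scored against every document.
profile = fold_in("股市", index, U_k, s_k)
scores = [cosine(profile, doc_coords[j]) for j in range(len(docs))]
for d, sc in zip(docs, scores):
    print(d, round(sc, 3))
```

A filtering agent built on this sketch would pass on the documents whose score exceeds a chosen threshold. Working on characters avoids a word segmentation step, and the truncated SVD lets documents that share few characters with the profile still score highly when they co-occur with related characters elsewhere in the corpus.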
| Original language | English |
|---|---|
| Pages | 209-223 |
| Number of pages | 15 |
| State | Published - 1997 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 1997 Proceedings of the 10th Research on Computational Linguistics International Conference, ROCLING 1997. All rights reserved.