Factors affecting text mining based stock prediction: Text feature representations, machine learning models, and news platforms

Wei Chao Lin, Chih Fong Tsai*, Hsuan Chen

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

18 Citations (Scopus)

Abstract

Text mining techniques have proven effective for stock market prediction, and different text feature representation approaches (e.g., TF–IDF and word embedding) have been adapted to extract textual information from financial news sources. In addition, different machine learning techniques, including deep learning, have been employed to construct the prediction models. Various combinations of text feature representations and learning models have been applied to stock prediction, but it is unknown which performs best or which can be regarded as representative baselines for future research. Moreover, since the textual content of financial news articles differs somewhat across news platforms, the choice of platform may affect prediction performance. This is also examined in the experiments, which compare eight different combinations of two context-free and two contextualized text feature representations (TF–IDF, Word2vec, ELMo, and BERT) and three learning techniques (SVM, CNN, and LSTM). The experimental results show that CNN+Word2vec and CNN+BERT perform the best. The textual material is taken from three public news platforms: Reuters, CNBC, and The Motley Fool. We found that the learning models constructed and the news platforms used can certainly affect the prediction of stock prices between different companies.

Original language: English
Article number: 109673
Journal: Applied Soft Computing Journal
Volume: 130
Publication status: Published - Nov 2022

Bibliographical note

Publisher Copyright:
© 2022
