A Text Mining Research Based on LDA Topic Modelling

Authors

Zhou Tong and Haiyi Zhang, Acadia University, Canada

Abstract

A large amount of digital text is generated every day, and effectively searching, managing and exploring this text data has become a major task. In this paper, we first present an introduction to text mining and to Latent Dirichlet Allocation (LDA), a probabilistic topic model. We then describe two experiments: topic modelling of Wikipedia articles and of users’ tweets. The former builds a document topic model, aiming at a topic-based approach to searching, exploring and recommending articles. The latter builds a user topic model, providing a full analysis of Twitter users’ interests. The experimental process, including data collection, data pre-processing and model training, is fully documented and commented. Furthermore, the conclusions and applications of this paper could serve as a useful computational tool for social and business research.

Keywords

topic model, LDA, text mining, probabilistic model

Full Text  Volume 6, Number 6