A large amount of digital text is generated every day. Effectively searching, managing and exploring this text data has become a major task. In this paper, we first present an introduction to text mining and to latent Dirichlet allocation (LDA), a probabilistic topic model. We then propose two experiments: topic modelling of Wikipedia articles and topic modelling of users' tweets. The former builds a document topic model, aiming at a topic-based approach to searching, exploring and recommending articles. The latter builds a user topic model, providing a thorough analysis of Twitter users' interests. The experimental process, including data collection, data pre-processing and model training, is fully documented and commented. Furthermore, the conclusions and applications of this paper could serve as a useful computational tool for social and business research.