This paper proposes an approach for analyzing citizens' comments in e-government using topic modeling and clustering algorithms. The main purpose of the approach is to determine which topics citizens' comments in the e-government environment address, and thereby to improve the quality of e-services. Topic modeling is one of the methods used for this purpose. In the proposed approach, citizens' comments are first clustered, and topics are then extracted from each cluster, which reveals the topics citizens discuss. However, clustering and topic modeling methods raise some problems in practice, including the large size of the document vectors and the assignment of semantically related documents to different clusters. To address these problems, the approach uses the semantic similarity of words to reduce dimensionality: among words that are semantically similar to one another, only one is kept and the others are discarded, so the size of the vectors is reduced. The documents are then clustered and topics are extracted from each cluster. The proposed method can significantly reduce the size of a large document set, save time spent analyzing the data, and improve the quality of clustering and of the LDA algorithm.
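The vocabulary-reduction step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the groups in SIMILAR_GROUPS are hypothetical and hand-written, standing in for whatever semantic similarity measure the paper uses, and the helper names (build_canonical_map, reduce_tokens, vocabulary, to_vector) and the sample comments are invented for illustration. The clustering and LDA stages are omitted; the sketch only shows how keeping one representative per group of similar words shrinks the document vectors.

```python
from collections import Counter

# Hypothetical groups of semantically similar words (an assumption for
# illustration; a real system would derive these from a similarity measure).
SIMILAR_GROUPS = [
    {"tax", "taxes", "levy"},
    {"road", "street", "highway"},
    {"fix", "repair", "mend"},
]


def build_canonical_map(groups):
    """Map every word in a group to one representative.

    The representative is the alphabetically first word, an arbitrary
    choice made here only so the result is deterministic.
    """
    mapping = {}
    for group in groups:
        representative = min(group)
        for word in group:
            mapping[word] = representative
    return mapping


def reduce_tokens(tokens, mapping):
    """Replace each token by its representative; unknown tokens are kept."""
    return [mapping.get(token, token) for token in tokens]


def vocabulary(docs):
    """Return the sorted list of distinct tokens across all documents."""
    return sorted({token for doc in docs for token in doc})


def to_vector(tokens, vocab):
    """Build a bag-of-words count vector over the given vocabulary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocab]


# Invented sample comments, tokenized by whitespace for simplicity.
comments = [
    "please repair the road near my house",
    "the street needs a fix",
    "taxes are too high",
    "lower the levy on small shops",
]
tokenized = [comment.split() for comment in comments]

mapping = build_canonical_map(SIMILAR_GROUPS)
reduced = [reduce_tokens(doc, mapping) for doc in tokenized]

vocab_before = vocabulary(tokenized)
vocab_after = vocabulary(reduced)

# Vectors over the reduced vocabulary; these would be the input to the
# clustering step, followed by topic extraction per cluster.
vectors = [to_vector(doc, vocab_after) for doc in reduced]

print(len(vocab_before), len(vocab_after))
```

In this toy input, "road" and "street" collapse to a single dimension, so the first two comments share more non-zero entries than they would with the raw vocabulary. That is the effect the approach relies on to keep semantically related documents in the same cluster.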