Keyword extraction from document text is an active research area. Because machine learning techniques have been applied successfully in many fields, this study explores how to optimize keyword extraction using Support Vector Machine for Ranking (SVMRank). First, we constructed features for each candidate word segmented from a document by using the output ranks of traditional extraction algorithms such as TF-IDF, TextRank, and LDA. Second, we labeled each candidate with an importance rank, assigned with manual assistance. Finally, we built an SVMRank model to learn how to rank the candidates. The main advantage of this approach is that it can integrate the strengths of other keyword extraction methods and overcome their shortcomings. The experimental results show that the SVMRank approach improves extraction precision and recall by 6% and 5%, respectively.
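
The three steps above can be illustrated with a minimal sketch. It is not the study's implementation and does not call the SVMrank tool; it substitutes a small linear pairwise ranking SVM trained by subgradient descent on a regularized hinge loss. The candidate words, their base-algorithm ranks, the importance labels, the reciprocal-rank feature mapping, and all function names and hyperparameters are assumptions made for illustration only.

```python
import random

# Hypothetical toy data: one document, five candidate words.
# Each tuple holds the rank (1 = best) given by TF-IDF, TextRank, and LDA.
BASE_RANKS = {
    "keyword":    (1, 2, 1),
    "extraction": (2, 1, 3),
    "svm":        (3, 3, 3),
    "study":      (4, 4, 5),
    "result":     (5, 5, 4),
}
# Hypothetical manual importance labels (higher = more important).
LABELS = {"keyword": 2, "extraction": 2, "svm": 1, "study": 0, "result": 0}


def to_features(ranks):
    """Map base-algorithm ranks to reciprocal ranks so larger means better (an assumption)."""
    return [1.0 / r for r in ranks]


def preference_pairs(features, labels):
    """Return difference vectors x_i - x_j for every pair with label_i > label_j."""
    pairs = []
    for wi, xi in features.items():
        for wj, xj in features.items():
            if labels[wi] > labels[wj]:
                pairs.append([a - b for a, b in zip(xi, xj)])
    return pairs


def train_ranking_svm(pairs, dim, lam=0.01, lr=0.05, epochs=200, seed=0):
    """Subgradient descent on an L2-regularized pairwise hinge loss."""
    rng = random.Random(seed)
    pairs = list(pairs)
    w = [0.0] * dim
    for _ in range(epochs):
        rng.shuffle(pairs)
        for d in pairs:
            margin = sum(wk * dk for wk, dk in zip(w, d))
            w = [wk * (1.0 - lr * lam) for wk in w]  # regularization shrink
            if margin < 1.0:  # pair is mis-ordered or inside the margin
                w = [wk + lr * dk for wk, dk in zip(w, d)]
    return w


def score(w, x):
    """Linear ranking score of one candidate."""
    return sum(wk * xk for wk, xk in zip(w, x))


features = {word: to_features(r) for word, r in BASE_RANKS.items()}
weights = train_ranking_svm(preference_pairs(features, LABELS), dim=3)
ranked = sorted(features, key=lambda word: score(weights, features[word]), reverse=True)
print(ranked)
```

The learned weights indicate how much each base algorithm's rank contributes to the combined ordering, which is one way to read the claim that the approach integrates the other methods. In a real setting, pairs would be formed only among candidates of the same document, as a query identifier does in ranking SVM tools.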