APPARATUS, METHOD AND COMPUTER PROGRAM FOR DOCUMENT CLASSIFICATION USING TERM ASSOCIATION ANALYSIS
Abstract
A document classification device includes: an input unit configured to receive multiple original documents, each including multiple words; an importance analysis unit that identifies the relatively important words among those in the original documents to determine a first word set; a correlation analysis unit that uses the correlations between the words in the first word set to determine a second word set; and a document classification unit that uses the second word set to classify the original documents. By using word correlation analysis to extract the feature set used for classification, the device removes low-importance noise terms and reflects the domain characteristics of the original documents in the data analysis, thereby improving both the performance and the processing speed of document classification compared with conventional approaches.
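The three stages described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which importance or correlation measures are used, so everything concrete here is an assumption chosen for illustration: summed TF-IDF as the importance score, document co-occurrence (a support count plus Jaccard similarity) as the correlation measure, and cosine nearest-neighbor matching against labeled examples as the classifier. All function names, parameters, and sample documents are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def tokenize(doc):
    # Assumption: lowercase whitespace tokenization is sufficient here.
    return doc.lower().split()


def first_word_set(docs, top_k):
    """Importance analysis (assumed measure: TF-IDF summed over documents)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(w for toks in tokenized for w in set(toks))
    score = Counter()
    for toks in tokenized:
        for w, c in Counter(toks).items():
            score[w] += (c / len(toks)) * math.log(n / df[w])
    ranked = sorted(score, key=lambda w: (-score[w], w))
    return set(ranked[:top_k])


def second_word_set(docs, words, min_support, min_jaccard):
    """Correlation analysis (assumed measure: co-occurrence support + Jaccard)."""
    doc_sets = [set(tokenize(d)) for d in docs]
    occurs = {w: {i for i, s in enumerate(doc_sets) if w in s} for w in words}
    kept = set()
    for a, b in combinations(sorted(words), 2):
        both = occurs[a] & occurs[b]
        either = occurs[a] | occurs[b]
        if len(both) >= min_support and len(both) / len(either) >= min_jaccard:
            kept.update((a, b))
    return kept


def vectorize(doc, features):
    tf = Counter(tokenize(doc))
    return [tf[w] for w in features]


def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def classify(doc, labeled, features):
    """Classification (assumed: label of the most cosine-similar example)."""
    vec = vectorize(doc, features)
    return max(labeled, key=lambda ex: cosine(vec, vectorize(ex[0], features)))[1]


# Hypothetical sample documents, for illustration only.
docs = [
    "the stock market rallied as stock prices rose",
    "the market fell and stock traders sold",
    "the team won the match with a late goal",
    "the goal sealed the match for the team",
]
first = first_word_set(docs, top_k=15)
second = second_word_set(docs, first, min_support=2, min_jaccard=0.5)
features = sorted(second)
labeled = [(docs[0], "finance"), (docs[2], "sports")]

print(features)  # ['goal', 'market', 'match', 'stock', 'team']
print(classify(docs[1], labeled, features))  # finance
print(classify(docs[3], labeled, features))  # sports
```

In this toy run the importance step drops the word "the", which appears in every document, and the correlation step keeps only terms that repeatedly co-occur, so the classifier works on five features instead of the full twenty-word vocabulary. A real system would likely use different measures and thresholds.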