Published in: International Conference on Cyberspace Technology

Study of test classification algorithm based on domain knowledge


Abstract

When facing diverse and massive data resources, how to use those resources effectively according to the field they belong to is one of the core problems of institutional repository research. In this paper, we improve the Bayesian classification algorithm and propose a text classification algorithm based on domain knowledge. We design and implement several key technologies: text classification, feature selection, weight improvement, and a domain-knowledge improvement of the algorithm. We use the widely applied IkAnalyzer to segment Chinese text into words. For feature selection and weight improvement, we focus on the processing of special vocabulary in the document. In the domain application part, we introduce a domain-extended vocabulary that assists the Bayesian formula in obtaining the final result. The experimental results show that the improved algorithm effectively increases classification accuracy and that the system's computation time is acceptable.
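The abstract does not give the improved formula, so the following is only a minimal sketch of the general idea: a multinomial naive Bayes classifier in which terms from a per-class domain vocabulary receive extra weight. The `boost` parameter, the function names, and the weighting scheme are assumptions made for illustration, not the authors' method. Input is assumed to be already segmented into tokens (the paper uses IkAnalyzer for that step).

```python
import math
from collections import Counter, defaultdict


def train(docs, domain_vocab, boost=2.0):
    """Fit a naive Bayes model with boosted domain terms.

    docs: list of (tokens, label) pairs, tokens already segmented.
    domain_vocab: dict mapping label -> set of domain terms (hypothetical input).
    boost: assumed weight for a domain term; ordinary terms count 1.0.
    """
    class_docs = Counter()               # number of documents per class
    term_weights = defaultdict(Counter)  # weighted term counts per class
    vocab = set()
    for tokens, label in docs:
        class_docs[label] += 1
        for tok in tokens:
            # Assumption: a term in the class's domain vocabulary counts `boost` times.
            w = boost if tok in domain_vocab.get(label, set()) else 1.0
            term_weights[label][tok] += w
            vocab.add(tok)
    return class_docs, term_weights, vocab


def classify(tokens, model):
    """Return the class with the highest log-posterior for the given tokens."""
    class_docs, term_weights, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        score = math.log(n_docs / total_docs)  # log prior
        # Laplace (add-one) smoothing over the training vocabulary.
        denom = sum(term_weights[label].values()) + len(vocab)
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore terms never seen in training
            score += math.log((term_weights[label][tok] + 1.0) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    docs = [
        (["neural", "network", "training"], "cs"),
        (["protein", "cell", "network"], "bio"),
    ]
    domain_vocab = {"cs": {"neural"}, "bio": {"protein"}}
    model = train(docs, domain_vocab)
    print(classify(["neural", "network"], model))
```

In this sketch the domain vocabulary only changes the weighted counts at training time; a real system might instead adjust feature weights at prediction time or expand the vocabulary with related terms, which the abstract mentions but does not specify.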
