
Method for multi-phase category assignment on text categorization system

Abstract

1. Technical field of the invention
The present invention relates to a multi-category assignment method in an automatic document classification system, and to a computer-readable recording medium on which a program for realizing the method is recorded.

2. Technical problem to be solved by the invention
The object of the invention is to provide a multi-category assignment method that assigns appropriate categories to each document in an automatic document classification system, and a computer-readable recording medium on which a program for realizing the method is recorded.

3. Summary of the solution of the invention
The multi-category assignment method in an automatic document classification system comprises four steps:
First step: select words capable of predicting a category and construct a list of category/word pairs.
Second step: with reference to the list of category/word pairs, express each learning (training) document as its corresponding words and their importance, and store the learning documents in an inverted index file.
Third step: select, from the set of learning documents, the example documents most similar to the document to be newly classified.
Fourth step: for each new document to be classified, calculate the probability of each candidate category, select only the single most probable category in each phase, and assign that category to the new document in that phase.

4. Important uses of the invention
The present invention is used for the automatic classification of documents.
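
The four steps can be illustrated with a minimal Python sketch. The abstract does not say how words are selected, how importance is weighted, how similarity is measured, or how category probabilities are computed, so every concrete choice here is an assumption made for illustration: a document-frequency ratio for word selection, raw term frequency as importance, cosine similarity for retrieval, similarity-weighted neighbour votes as the probability estimate, and a fixed threshold as the stopping rule. All function names, parameters, and the toy data are hypothetical and are not taken from the patent.

```python
import math
from collections import Counter, defaultdict

def select_category_words(train, top_n=3):
    """Step 1: build category/word pairs from words that predict a category."""
    df = Counter()                 # documents containing each word
    cat_df = defaultdict(Counter)  # per category: documents containing each word
    for words, cats in train:
        for w in set(words):
            df[w] += 1
            for c in cats:
                cat_df[c][w] += 1
    pairs = set()
    for c, counts in cat_df.items():
        # Assumed score: share of the word's documents that carry category c.
        ranked = sorted(counts, key=lambda w: (-counts[w] / df[w], -counts[w], w))
        pairs.update((c, w) for w in ranked[:top_n])
    return pairs

def build_inverted_index(train, pairs):
    """Step 2: store each training document as selected words with weights."""
    vocab = {w for _, w in pairs}
    index = defaultdict(dict)      # word -> {doc_id: weight}
    norms = {}                     # doc_id -> vector length
    for doc_id, (words, _) in enumerate(train):
        tf = Counter(w for w in words if w in vocab)  # assumed weight: raw TF
        for w, n in tf.items():
            index[w][doc_id] = float(n)
        norms[doc_id] = math.sqrt(sum(n * n for n in tf.values()))
    return index, norms, vocab

def most_similar(query_words, index, norms, vocab, k=3):
    """Step 3: retrieve the k training documents most similar to the query."""
    q = Counter(w for w in query_words if w in vocab)
    q_norm = math.sqrt(sum(n * n for n in q.values()))
    dots = defaultdict(float)
    for w, qn in q.items():
        for doc_id, weight in index.get(w, {}).items():
            dots[doc_id] += qn * weight
    sims = {d: dot / (q_norm * norms[d]) for d, dot in dots.items()}
    return sorted(sims.items(), key=lambda kv: (-kv[1], kv[0]))[:k]

def assign_categories(neighbours, train, max_phases=3, threshold=0.3):
    """Step 4: in each phase, assign only the single most probable category."""
    total = sum(sim for _, sim in neighbours)
    scores = defaultdict(float)
    for doc_id, sim in neighbours:
        for c in train[doc_id][1]:
            scores[c] += sim
    assigned = []
    for _ in range(max_phases):
        remaining = {c: s / total for c, s in scores.items() if c not in assigned}
        if not remaining:
            break
        best = max(remaining, key=lambda c: (remaining[c], c))
        if remaining[best] < threshold:  # assumed stopping rule
            break
        assigned.append(best)
    return assigned

def classify(query_words, train):
    """Run the four steps end to end for one new document."""
    pairs = select_category_words(train)
    index, norms, vocab = build_inverted_index(train, pairs)
    neighbours = most_similar(query_words, index, norms, vocab)
    return assign_categories(neighbours, train)

# Toy training set: (words, categories) per document.
TRAIN = [
    (["goal", "match", "team"], ["sports"]),
    (["election", "vote", "party"], ["politics"]),
    (["match", "vote", "stadium", "funding"], ["sports", "politics"]),
    (["stock", "market", "bank"], ["finance"]),
]

print(classify(["match", "goal", "vote"], TRAIN))
```

In this sketch the categories are returned in the order in which the phases assigned them, so the first entry is the one judged most probable. A document with no selected words retrieves no neighbours and receives no category.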
