Text Categorization Based on the Fuzzy Classification Rules Tree

郭玉琴; 袁方; 刘海博

Abstract

Conventional text categorization methods based on association rules must traverse every rule in the classifier to classify a text, which makes classification very inefficient. To address this problem, a text categorization approach based on the FCR-tree (fuzzy classification rules tree) is proposed. The rules of the classifier are stored in a tree; because the tree structure avoids storing repeated nodes, it saves significant space when a large rule set contains many repeated words. Compared with ordinary classification rules, fuzzy classification association rules contain not only words but also the fuzzy sets corresponding to the frequencies with which those words appear in texts, so the construction and structure of an FCR-tree differ from those of an ordinary classification rules tree (CR-tree). To reduce the difficulty of constructing and traversing the FCR-tree, multiple k-FCR-trees are built. When the tree is searched, if the word in a node does not appear in the text to be classified, the sub-tree led by that node need not be searched, which greatly reduces the number of rules that must be matched. Experimental results show that the approach is feasible and that it is clearly more efficient than classification by traversing the whole classifier.
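
The pruning idea in the abstract can be illustrated with a small prefix tree. The following is only a minimal sketch under stated assumptions, not the authors' FCR-tree: the node layout, the `low`/`high` membership functions, the use of the minimum membership as a matching degree, and all names (`Node`, `insert_rule`, `match_rules`) are hypothetical. It uses a single tree and does not model the k-FCR-trees mentioned above.

```python
from dataclasses import dataclass, field


# Hypothetical fuzzy sets over a term's frequency in a text (assumed shapes).
def low(freq):
    return max(0.0, min(1.0, (4 - freq) / 3))    # 1 for freq <= 1, 0 for freq >= 4


def high(freq):
    return max(0.0, min(1.0, (freq - 1) / 3))    # 0 for freq <= 1, 1 for freq >= 4


FUZZY_SETS = {"low": low, "high": high}


@dataclass
class Node:
    children: dict = field(default_factory=dict)  # (term, fuzzy label) -> Node
    rules: list = field(default_factory=list)     # (class label, confidence) ending here


def insert_rule(root, antecedent, label, confidence):
    """Insert one rule; rules sharing a prefix share the same nodes."""
    node = root
    for item in sorted(antecedent):               # fixed order so prefixes coincide
        node = node.children.setdefault(item, Node())
    node.rules.append((label, confidence))


def match_rules(node, term_freqs, degree=1.0):
    """Yield (class label, confidence, matching degree) for rules whose terms all occur."""
    for label, confidence in node.rules:
        yield label, confidence, degree
    for (term, fuzzy_label), child in node.children.items():
        if term not in term_freqs:
            continue                              # prune: skip the whole sub-tree
        membership = FUZZY_SETS[fuzzy_label](term_freqs[term])
        # Assumption: the matching degree is the minimum membership along the path.
        yield from match_rules(child, term_freqs, min(degree, membership))


# Usage with made-up rules and a made-up text (term -> frequency).
root = Node()
insert_rule(root, [("goal", "high"), ("match", "low")], "sports", 0.9)
insert_rule(root, [("goal", "high"), ("team", "high")], "sports", 0.8)
insert_rule(root, [("stock", "high")], "finance", 0.95)

text = {"goal": 4, "team": 4}
matches = list(match_rules(root, text))
print(matches)
```

In this toy example the two "sports" rules share the `("goal", "high")` node, and the sub-trees under `match` and `stock` are never entered because those terms are absent from the text. A rule whose terms all occur but whose fuzzy sets do not fit is still yielded with a matching degree of 0, so a caller would filter on the degree.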


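Bibliographic details

Source
Journal of Southeast University (English Edition) (《东南大学学报（英文版）》) | 2008, No. 3 | pp. 339-342 | 4 pages
Authors
郭玉琴; 袁方; 刘海博
Author affiliations

College of Mathematics and Computer Science, Hebei University, Baoding 071002

Tianjin Branch, People's Bank of China, Tianjin 300040

Original format: PDF
Language of text: chi
Chinese Library Classification: Computer networks
Keywords
Text categorization; fuzzy classification association rules; classification rules tree; fuzzy classification rules tree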
