Research on Example-based Text Categorization



Abstract

The goal of text categorization is to classify documents automatically into predefined categories. Text categorization usually uses a training text corpus to create a classifier by machine learning, then compares the features of an unlabeled document with those of the classifier's classes and assigns the document to the most similar category. Several algorithms support this approach, such as Nearest Neighbor, Naive Bayes, and Support Vector Machine. The approach has disadvantages: the algorithms are complicated, and the classes are few in number and shallow in level. This paper proposes a new method of text categorization from a different angle. It uses manual indexing experience, drawn from large bibliographic databases, to construct an example base for automatic classification. Each record in the base is an indexing record that contains a cross concordance between class numbers and strings. Text categorization is then performed by computing the similarity between the feature strings of an unlabeled document and each indexing example. Empirical results show that this method has several advantages, such as simpler computation, more classes, and deeper class levels. The paper describes at length the algorithm, the construction of the example base for classification, and the performance of the system.
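The abstract does not specify the similarity measure or the record layout, so the following is only a minimal Python sketch of the general idea: match each feature string of an unlabeled document against the strings in an example base and accumulate scores per class number. The Dice coefficient over word sets, the `threshold` parameter, the function names, and the class numbers in `EXAMPLE_BASE` are all assumptions made for illustration, not details taken from the paper.

```python
def tokens(text):
    """Lowercased word set of a string (illustrative tokenization)."""
    return frozenset(text.lower().split())


def dice(a, b):
    """Dice coefficient between two word sets; 0.0 if either is empty."""
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical example base: (indexing string, class number) pairs.
# The class numbers are made up and do not come from any real scheme.
EXAMPLE_BASE = [
    ("digital library metadata", "C1.2"),
    ("support vector machine", "C2.1"),
    ("text categorization", "C2.3"),
]


def classify(feature_strings, example_base, threshold=0.5):
    """Rank class numbers by summed similarity to the document's features.

    Every feature string is compared with every example string; matches
    scoring at least `threshold` add their score to the example's class.
    Returns (class number, score) pairs, highest score first.
    """
    scores = {}
    for feature in feature_strings:
        feature_tokens = tokens(feature)
        for example_string, class_no in example_base:
            score = dice(feature_tokens, tokens(example_string))
            if score >= threshold:
                scores[class_no] = scores.get(class_no, 0.0) + score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


doc_features = [
    "automatic text categorization",
    "text categorization algorithm",
    "digital library",
]
ranking = classify(doc_features, EXAMPLE_BASE)
print(ranking[0][0])  # best-matching class number: C2.3
```

Because the classifier is just a lookup over indexing records, adding a class in this sketch means adding records rather than retraining a model, which is one plausible reading of the "simpler computation, more classes" claim.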


Bibliographic details

Source
International Conference on Universal Digital Library | 2005 | 7 pages
Authors
XUE Chun-xiang; HOU Han-qing; Ministry of Education (MOE) of China; US National Science Foundation (NSF); Indian Institute of Science (IISc)
Original format: PDF
Chinese Library Classification: Electronic libraries, digital libraries
Keywords
automatic classification; example-based text categorization; example base for classification


Similar documents

1. Contextual Text Categorization: An Improved Stemming Algorithm to Increase the Quality of Categorization in Arabic Text [J]. Gadri Said, Moussaoui Abdelouahab. The International Arab Journal of Information Technology. 2017, No. 6.
2. Interactive example-based finding of text items [J]. Medvet Eric, Bartoli Alberto, De Lorenzo Andrea. Expert Systems with Applications. 2020, issue Sepa.
3. Text-Informed Audio Source Separation. Example-Based Approach Using Non-Negative Matrix Partial Co-Factorization [J]. Luc Le Magoarou, Alexey Ozerov, Ngoc Q. K. Duong. Journal of VLSI Signal Processing. 2015, No. 2.
4. Research on Example-based Text Categorization [C]. XUE Chun-xiang, HOU Han-qing. International Conference on Universal Digital Library (ICUDL2005); 2005-10-31 to 2005-11-02; Hangzhou (CN). 2005.
5. The implementation of dynamic document organization using the integration of text clustering and text categorization [D]. Jo, Taeho. 2006.
6. BICEPP: an example-based statistical text mining method for predicting the binary characteristics of drugs [O]. Frank PY Lin, Stephen Anthony, Thomas M Polasek. 2011.
7. Neural Text Categorizer for Exclusive Text Categorization [O]. Taeho Jo. 2009.
