Nonparametric methods for automatic classification of documents and transactions (abstract)

Abstract

How to classify documents is a central problem in document retrieval. The classification problem can be stated as follows. There exists a large collection of documents, each of which contains a set of terms. How should the documents be clustered to allow the selection of index terms so that the collection can be searched to the maximal collective benefit of the retrieval system's customers? Traditionally, transaction functionalities are manually scheduled into deferred and immediate queues for processing, with no special consideration given to the interwoven functionalities invoked by the different user groups. How to classify transactions for the concurrency controller in a distributed system is likewise a major problem in transaction scheduling. The problem here is: how should transaction functionalities be scheduled for processing to satisfy the requirements of the different user groups? That is, how should transaction functionalities be organized on disk to minimize disk access time, in the hope of fulfilling the requirements of individual user groups?

This paper presents nonparametric algorithms and a heuristic for the automatic classification of documents according to the similarity of their keywords, that is, the words likely to be useful as index terms for the document set. The normal approximation to the binomial distribution was explored as an index for the automatic classification of documents and transactions. A nonparametric measure of association consistent with the Cramér statistic was used to examine similarities among documents. A nonparametric analysis of variance procedure was developed for comparing the profiles of term frequencies between documents, or of transaction functionalities invoked between users. The usefulness of the heuristic for automatically classifying user groups according to the transaction functionalities they invoke in a distributed system is discussed.
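The abstract does not give formulas, so the following is only a minimal sketch of the two textbook statistics it names, not the author's actual procedure. It assumes each document is represented as a row of term counts over a shared vocabulary, computes the standard Cramér's V for the resulting contingency table, and computes a z-score from the normal approximation to a binomial count. The function names `cramers_v` and `binomial_z` and the sample counts are hypothetical.

```python
import math


def cramers_v(table):
    """Textbook Cramér's V for an r x c table of non-negative counts.

    Assumption: rows are documents, columns are terms, and the table
    has at least two rows and two columns with a positive total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson chi-square: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip terms absent from every document
                chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


def binomial_z(successes, trials, p0):
    """z-score of a count under the normal approximation to Binomial(trials, p0)."""
    mean = trials * p0
    sd = math.sqrt(trials * p0 * (1.0 - p0))
    return (successes - mean) / sd


if __name__ == "__main__":
    # Hypothetical term counts for two documents over three terms.
    same_profile = [[10, 20, 30], [10, 20, 30]]
    disjoint_terms = [[10, 0], [0, 10]]
    print(cramers_v(same_profile))     # identical profiles: no association
    print(cramers_v(disjoint_terms))   # no shared terms: maximal association
    print(binomial_z(60, 100, 0.5))    # 60 occurrences in 100 trials vs. p0 = 0.5
```

In this framing, a V near 0 means document identity tells you little about which terms occur, so the two term-frequency profiles are similar; a V near 1 means the documents use largely different terms. Whether the paper thresholds the statistic this way is not stated in the abstract.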


Bibliographic details

Source: Proceedings of the 1990 ACM annual conference on Cooperation, 1990, p. 1 (1 page)
Conference location: Washington DC (US)
Author: Amos O. Olagunju
Author affiliation: Department of Mathematics and Computer Science, N.C. A&T State University, Greensboro, NC
Original format: PDF
Language of text: English
Chinese Library Classification: Computing technology, computer technology

