Journal: Systems and Computers in Japan

A Study of Document Filtering Using the Subspace Method of Pattern Recognition

Abstract

One typical filtering approach that imposes no constraint on document structure is based on the vector space model. In this method, the words in a document are assigned to the elements of a vector, and documents are selected statistically according to their similarity in the resulting feature vector space. Until now, documents have been ranked in descending order of similarity, and filtering has been one-dimensional: documents are selected either by taking a specified number from the top of the ranking or by applying a preset threshold. A problem with this kind of filtering is that it only selects necessary information by following the ranking order; noise elimination, that is, rejecting unnecessary information, is not directly considered. From this viewpoint, this study considers both the positive and the negative interests of the user. Filtering is treated as the corresponding two-category problem, and a filtering method based on pattern classification is proposed. The method uses the subspace method, a known pattern recognition technique. The proposed method achieves highly precise filtering by introducing co-occurrence relations among words, and it handles both the representation and the updating of interest items with a single mechanism. Its effectiveness is demonstrated by experiments using newspaper articles.
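
The two-category idea in the abstract can be illustrated with a minimal sketch of a generic subspace classifier. This is not the authors' implementation: the abstract gives no formulas, so the sketch assumes a CLAFIC-style variant in which each category (positive and negative interest) is represented by the leading eigenvectors of its autocorrelation matrix, and a document is accepted when its projection onto the positive subspace is larger than its projection onto the negative one. The four-term vocabulary, the toy term counts, the subspace dimension, and the function names `fit_subspace`, `projection_score`, and `accept` are all hypothetical.

```python
import numpy as np

def fit_subspace(X, dim):
    """Return an orthonormal basis (as columns) for one category.

    X holds one term-count vector per row. Rows are scaled to unit length,
    and the basis is made of the `dim` leading eigenvectors of the
    category's autocorrelation matrix (assumed CLAFIC-style construction).
    """
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    R = X.T @ X / len(X)                  # autocorrelation matrix (terms x terms)
    eigvals, eigvecs = np.linalg.eigh(R)  # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :dim]      # keep the leading eigenvectors

def projection_score(x, basis):
    """Squared length of the projection of the unit-scaled x onto the subspace."""
    x = x / np.linalg.norm(x)
    return float(np.sum((basis.T @ x) ** 2))

def accept(x, pos_basis, neg_basis):
    """Accept a document if it lies closer to the positive-interest subspace."""
    return projection_score(x, pos_basis) > projection_score(x, neg_basis)

# Hypothetical vocabulary: ["election", "vote", "goal", "match"]
positive_docs = np.array([[2, 1, 0, 0], [1, 2, 0, 0], [3, 1, 0, 0]])
negative_docs = np.array([[0, 0, 2, 1], [0, 0, 1, 2], [0, 1, 1, 3]])

pos_basis = fit_subspace(positive_docs, dim=1)
neg_basis = fit_subspace(negative_docs, dim=1)

politics_doc = np.array([1.0, 1.0, 0.0, 0.0])
sports_doc = np.array([0.0, 0.0, 1.0, 1.0])
print(accept(politics_doc, pos_basis, neg_basis))
print(accept(sports_doc, pos_basis, neg_basis))
```

Because the autocorrelation matrix contains products of term weights, its eigenvectors reflect which words tend to occur together, which is one plausible reading of the co-occurrence relation mentioned in the abstract. Under this assumed construction, updating an interest would amount to recomputing the basis from an extended set of example documents.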
