Efficient System for Clustering of Dynamic Document Database


Abstract

In this paper we describe a system that groups and classifies documents and finds latent semantic features in a database composed of a large number of documents. The database grows constantly as the users who co-create it add new documents. Users require a system that provides information both about a specific document and about the entire set of documents. This information includes statistical data about words in documents, information about the aspects in which these words appear, classification, clustering, etc. To meet these expectations we propose using methods that search for hidden patterns in multivariable data. We apply machine learning algorithms for data analysis that are useful in identifying local patterns in multivariate data. We consider two different algorithms described in the literature: (1) the Probabilistic Latent Semantic Analysis method [2] and (2) the Nonnegative Matrix Factorization algorithm described in [4] and used in the text analysis system [1].
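To make the Nonnegative Matrix Factorization idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small hypothetical term-document count matrix `V` and uses the common multiplicative-update rule for the squared-error objective; the abstract does not say which NMF variant or which preprocessing the system uses. The function name `nmf`, the toy vocabulary, and all parameter values are illustrative assumptions.

```python
import numpy as np

def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix V (terms x documents) as W @ H.

    W (terms x k) holds the latent semantic features, H (k x documents)
    holds each document's weight on those features. Uses multiplicative
    updates for the squared Frobenius error (an assumed variant).
    """
    rng = np.random.default_rng(seed)
    n_terms, n_docs = V.shape
    W = rng.random((n_terms, k)) + eps
    H = rng.random((k, n_docs)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy data: rows are terms, columns are documents.
terms = ["matrix", "factor", "cluster", "galaxy", "star", "orbit"]
V = np.array([
    [3, 2, 4, 0, 0],
    [2, 3, 1, 0, 0],
    [1, 2, 3, 0, 0],
    [0, 0, 0, 3, 2],
    [0, 0, 0, 2, 4],
    [0, 0, 0, 1, 2],
], dtype=float)

W, H = nmf(V, k=2)

# Assign each document to the feature on which it has the largest weight.
doc_cluster = H.argmax(axis=0)

# List the terms that load most strongly on each latent feature.
for c in range(W.shape[1]):
    top = [terms[i] for i in np.argsort(W[:, c])[::-1][:3]]
    print(f"feature {c}: {top}")
print("document clusters:", doc_cluster)
```

In a growing database, one plausible way to handle a newly added document is to keep `W` fixed and solve only for its column of `H`, which avoids refactoring the whole matrix; whether the described system does this is not stated in the abstract.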


Bibliographic details

Source
Cooperative Design, Visualization, and Engineering | 2011 | pp. 186-189 | 4 pages
Conference location: Hong Kong (HK)
Authors
Pawel Foszner; Aleksandra Gruca; Andrzej Polanski
Author affiliation (all three authors)

Silesian University of Technology, Institute of Informatics, Akademicka 16, 44-100 Gliwice, Poland

Original format: PDF
Language of text: English
Chinese Library Classification: Computing technology, computer technology
Keywords
clustering; classification; NMF; semantic features; document database

