Using the Google Similarity Distance for OLAP Textual Aggregation

Abstract

With the tremendous growth of unstructured data in Business Intelligence, there is a need to incorporate textual data into data warehouses, to provide appropriate multidimensional analysis (OLAP), and to develop new approaches that take the textual content of data into account. Such approaches provide textual measures to users who wish to analyze documents online. In this paper, we propose a new aggregation function for textual data in an OLAP context. To aggregate keywords, our contribution is to use a data mining technique, such as k-means, with a distance based on the Google similarity distance. Our approach thus considers the semantic similarity of keywords when aggregating them. The performance of our approach is analyzed and compared with another method that uses the k-bisecting clustering algorithm and is based on the Jensen-Shannon divergence between probability distributions. The experimental study shows that our approach achieves better performance in terms of recall, precision, F-measure, complexity, and runtime.
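
The abstract does not spell out the distance or the clustering loop, so the following is only a minimal sketch under stated assumptions. It uses the standard Normalized Google Distance formula, NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), but computes the counts f from a small local document collection rather than from Google hit counts. Because k-means needs vector centroids, the sketch swaps in a k-medoids-style loop that works directly on pairwise distances; this is an illustrative substitute, not necessarily the authors' procedure. The function names `ngd`, `build_distance`, and `cluster_keywords` are hypothetical.

```python
import math
from itertools import combinations


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from occurrence counts.

    fx, fy: number of documents containing each keyword
    fxy:    number of documents containing both keywords
    n:      total number of documents (assumption: local corpus size)
    """
    if fx == 0 or fy == 0 or fxy == 0:
        return float("inf")  # the keywords never co-occur
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    denom = math.log(n) - min(lx, ly)
    if denom == 0:
        return 0.0  # both keywords occur in every document
    return (max(lx, ly) - lxy) / denom


def build_distance(docs):
    """docs is a list of keyword sets; returns (keywords, pairwise distances)."""
    n = len(docs)
    keywords = sorted(set().union(*docs))
    freq = {k: sum(k in d for d in docs) for k in keywords}
    dist = {(k, k): 0.0 for k in keywords}
    for a, b in combinations(keywords, 2):
        co = sum(a in d and b in d for d in docs)
        dist[a, b] = dist[b, a] = ngd(freq[a], freq[b], co, n)
    return keywords, dist


def cluster_keywords(keywords, dist, medoids, max_iter=20):
    """k-medoids-style clustering on a precomputed distance (not plain k-means)."""
    medoids = list(medoids)
    clusters = {}
    for _ in range(max_iter):
        # Assignment step: each keyword joins its nearest medoid.
        clusters = {m: [] for m in medoids}
        for k in keywords:
            nearest = min(medoids, key=lambda m: dist[k, m])
            clusters[nearest].append(k)
        # Update step: the new medoid minimizes total distance within its cluster.
        new_medoids = [
            min(members, key=lambda c: sum(dist[c, o] for o in members))
            if members else m
            for m, members in clusters.items()
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return clusters
```

Each resulting cluster can then be summarized by its medoid, which serves as one aggregated keyword for the corresponding cell of the cube. The initial medoids are passed in explicitly here to keep the sketch deterministic.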

Bibliographic record

Source
International Conference on Enterprise Information Systems, 2015, 7 pages
Authors
Mustapha Bouakkaz; Sabine Loudcher; Youcef Ouinten;
Original format: PDF
Chinese Library Classification: G20-53
Keywords
OLAP; Textual Data; Aggregation Function; Google Similarity;

Similar literature

1. OLAP textual aggregation approach using the Google similarity distance [J]. Mustapha Bouakkaz, Sabine Loudcher, Youcef Ouinten. International Journal of Business Intelligence and Data Mining. 2016, No. 1
2. Textual aggregation approaches in OLAP context: A survey [J]. Bouakkaz Mustapha, Ouinten Youcef, Loudcher Sabine. International Journal of Information Management. 2017, No. 6
3. Automatic keyword prediction using Google similarity distance [J]. Ping-I Chen, Shi-Jen Lin. Expert Systems with Applications. 2010, No. 3
4. Using the Google Similarity Distance for OLAP Textual Aggregation [C]. Mustapha Bouakkaz, Sabine Loudcher, Youcef Ouinten. International Conference on Enterprise Information Systems. 2015
5. Partial aggregation and query processing of OLAP cubes [D]. Guo, Zhenshan. 2009
6. Distributed representation and one-hot representation fusion with gated network for clinical semantic textual similarity [O]. Ying Xiong, Shuai Chen, Haoming Qin. 2020
7. A New Tool for Textual Aggregation in OLAP Context [O]. Bouakkaz Mustapha, Loudcher Sabine, Ouinten Youcef. 2016
