A New Similarity Metric for Sequential Data

Pradeep Kumar; P. Radha Krishna; Bapi S. Raju

Abstract

In many data mining applications, both classification and clustering algorithms require a distance or similarity measure. The central problem in similarity-based clustering and classification of sequential data is choosing an appropriate similarity metric. Existing metrics such as Euclidean, Jaccard, and Cosine do not explicitly exploit the sequential nature of the data. In this paper, the authors propose a similarity-preserving function called the Sequence and Set Similarity Measure (S³M), which captures both the order of occurrence of items in sequences and the constituent items of the sequences. The authors demonstrate the usefulness of the proposed measure for classification and clustering tasks. Experiments were conducted on two benchmark datasets, DARPA '98 and msnbc, for a classification task in the intrusion detection domain and a clustering task in the web mining domain. Results show the usefulness of the proposed measure.
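
The abstract does not give the formula for the measure, so the following is only a minimal sketch of one way to combine an order-sensitive score with a set-based score. It assumes a weighted sum of the normalized longest-common-subsequence length and the Jaccard coefficient; the function names, the weight `p`, and the handling of empty sequences are all illustrative assumptions, not the authors' definitions.

```python
from typing import Hashable, Sequence


def lcs_length(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Length of the longest common subsequence of a and b (order-sensitive)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def jaccard(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Jaccard coefficient of the item sets of a and b (order-insensitive)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    # Assumption: two empty sequences are treated as identical.
    return len(set_a & set_b) / len(union) if union else 1.0


def sequence_set_similarity(
    a: Sequence[Hashable], b: Sequence[Hashable], p: float = 0.5
) -> float:
    """Hypothetical combination: p * order score + (1 - p) * set score."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    longest = max(len(a), len(b))
    # Assumption: normalize the LCS length by the longer sequence.
    order_score = lcs_length(a, b) / longest if longest else 1.0
    return p * order_score + (1.0 - p) * jaccard(a, b)


if __name__ == "__main__":
    s1 = ["login", "mail", "logout"]
    s2 = ["mail", "login", "logout"]
    # Same items in a different order: the set score is 1, the order score is lower.
    print(sequence_set_similarity(s1, s2, p=0.5))
```

With `p = 0` the sketch reduces to plain Jaccard similarity and ignores order; with `p = 1` it depends only on the shared ordering. Intermediate values trade the two off.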

Bibliographic details

Source: International Journal of Data Warehousing and Mining, 2010, Issue 4, 17 pages
Authors: Pradeep Kumar; P. Radha Krishna; Bapi S. Raju
Original format: PDF
Language: English
Chinese Library Classification: Mining engineering
Keywords: Sequence Classification; Sequence Clustering; Sequence Data; Similarity Measures; Similarity Metric
Date added to database: 2022-08-18 10:40:40

Similar literature

1. A New Similarity Metric for Sequential Data [J]. Pradeep Kumar, P. Radha Krishna, Bapi S. Raju. International Journal of Data Warehousing and Mining, 2010, Issue 4.
2. A novel similarity metric with application to big process data analytics [J]. Zijian Guo, Chao Shang, Hao Ye. Control Engineering Practice, 2021, Issue Auga.
3. Universal Waveshape-Based Disturbance Detection in Power Quality Data Using Similarity Metrics [J]. Bastos Alvaro Furlani, Santoso Surya. IEEE Transactions on Power Delivery, 2020, Issue 4.
4. A sequential similarity metric for case injected genetic algorithms applied to TSPs [C]. Sushil J. Louis, Yongmian Zhang. Genetic and evolutionary computation conference; GECCO-99; International conference on Genetic Algorithms; ICGA-99; Annual genetic programming conference; GP-99, 1999.
5. The effect of lineup member similarity on recognition accuracy in simultaneous and sequential lineups [D]. Flowe, Heather D. 2005.
6. Characterization of Diffusion Metric Map Similarity in Data From a Clinical Data Repository Using Histogram Distances [O]. Graham C. Warner, Karl G. Helmer. 2018.
7. Table S6: A list of all non-metric multidimensional scaling (NMDS), analyses of similarity (ANOSIM), and similarity percentage (SIMPER) analyses performed on fish and benthic data [O].
