Journal: International journal on engineering applications

GraphConnect: Framework of Discovering Closed Highly Connected Pattern from Semistructured Dataset



Abstract

Semistructured data appears when the source does not impose a rigid structure on the data, as on the web, or when data is combined from several heterogeneous sources. In mathematical terms, a semistructured data set can be treated as a graph data set. One particularly interesting problem in mining semistructured patterns is finding frequent, highly connected subgraphs in large relational graphs. The problem is to find graphs that are not only frequent but also satisfy a connectivity constraint. We identify three major characteristics that distinguish this setting from the previous frequent graph mining problem. First, in relational graphs each node represents a distinct object, so no two nodes share the same label; in biological networks, for example, nodes often represent unique objects such as genes and enzymes. Second, relational graphs may be very large. Third, the interesting patterns should not only be frequent but also satisfy the connectivity constraint. To handle these new challenges, we identify two issues that have to be solved: (1) how to mine frequent graphs efficiently in large relational graphs, and (2) how to handle the connectivity constraint. Since frequent graph mining usually generates too many patterns, it is more appealing to mine only closed frequent graphs. Our major contribution is to tackle the connectivity constraint. We use the minimum cut criterion to measure the connectivity of a pattern and examine the issues of integrating the connectivity constraint with the closed graph mining process.
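The three notions the abstract combines (frequency, closedness, and minimum-cut connectivity) can be illustrated with a small sketch. This is not the GraphConnect algorithm, which the abstract does not spell out. It assumes that, because no two nodes share a label, each relational graph can be stored as a set of undirected edges, so a pattern occurs in a graph exactly when its edge set is a subset. All names (`edge`, `support`, `is_closed`, `min_cut`) and the toy data are hypothetical, and the minimum cut is found by brute force over vertex bipartitions, which is only practical for tiny patterns; a polynomial-time algorithm such as Stoer-Wagner would be the usual choice at scale.

```python
from itertools import combinations


def edge(u, v):
    """Undirected edge between two distinct labels (no self-loops assumed)."""
    return frozenset((u, v))


def support(pattern, graphs):
    """Number of relational graphs whose edge set contains the whole pattern."""
    return sum(1 for g in graphs if pattern <= g)


def is_closed(pattern, graphs):
    """Closed in the edge-set sense: no one-edge extension keeps the same support.

    Simplification: extensions are not required to stay connected to the pattern.
    """
    base = support(pattern, graphs)
    candidates = set().union(*graphs) - pattern
    return all(support(pattern | {e}, graphs) < base for e in candidates)


def min_cut(pattern):
    """Minimum number of edges whose removal disconnects the pattern.

    Brute force: try every split of the vertices into two non-empty sides.
    """
    vertices = sorted({v for e in pattern for v in e})
    if len(vertices) < 2:
        return 0
    anchor, rest = vertices[0], vertices[1:]
    best = len(pattern)
    # Keep `anchor` on one side so each bipartition is visited once;
    # k < len(rest) guarantees the other side is never empty.
    for k in range(len(rest)):
        for chosen in combinations(rest, k):
            side = {anchor, *chosen}
            crossing = sum(1 for e in pattern if len(e & side) == 1)
            best = min(best, crossing)
    return best


# Toy data set (hypothetical): three relational graphs over labels a-d.
triangle = frozenset({edge("a", "b"), edge("b", "c"), edge("a", "c")})
with_tail = triangle | {edge("c", "d")}
graphs = [with_tail, triangle, with_tail]

for name, p in [("triangle", triangle), ("with_tail", with_tail)]:
    print(name, support(p, graphs), is_closed(p, graphs), min_cut(p))
```

In this toy data the triangle occurs in all three graphs and needs two edge removals to disconnect, whereas the pattern with the extra edge c-d occurs in only two graphs and is disconnected by removing that single edge. With a connectivity threshold of 2, only the triangle would qualify as highly connected.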
