Published in: International Conference on Current Trends in Theory and Practice of Computer Science

Using n-grams for the Automated Clustering of Structural Models

Abstract

Model comparison and clustering are important for dealing with many models in data analysis and exploration, e.g. in domain model recovery or model repository management. In structural models in particular, information is captured not only in the model elements themselves (e.g. in names and types) but also in the structural context, i.e. the relation of one element to the others. Some approaches handle a large number of models but ignore the structural context of model elements; others handle very few (typically two) models while applying sophisticated structural techniques. In this paper we address both aspects and extend our previous work on model clustering based on the vector space model with a technique for incorporating structural context in the form of n-grams. We compare n-gram accuracy on two datasets of Ecore metamodels from the AtlanMod Zoo: small random samples using up to trigrams, and a larger one (~100 models) using up to bigrams.
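
To make the idea of structural n-grams concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify how n-grams are extracted, weighted, or compared, so everything here is an assumption for illustration: a model is reduced to a dictionary mapping an element name to the names of the elements it references, an n-gram is a path of n names along those references, features are raw counts, and similarity is cosine similarity. The names `ngrams`, `vectorize`, `cosine`, `model_a`, and `model_b` are hypothetical.

```python
from collections import Counter
from math import sqrt


def ngrams(edges, n):
    """Return every path of n element names that follows the directed edges.

    `edges` maps an element name to the names it references (an assumed,
    simplified stand-in for a structural model such as an Ecore metamodel).
    """
    nodes = set(edges) | {t for targets in edges.values() for t in targets}
    paths = [(node,) for node in sorted(nodes)]
    for _ in range(n - 1):
        # Extend each path by one step along the outgoing references.
        paths = [p + (t,) for p in paths for t in edges.get(p[-1], [])]
    return paths


def vectorize(edges, max_n):
    """Count all n-grams for n = 1..max_n; the Counter acts as a sparse vector."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(edges, n))
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Two toy models with identical element names but different structure.
model_a = {"Library": ["Book", "Member"], "Book": ["Author"]}
model_b = {"Library": ["Author"], "Author": ["Book"], "Book": ["Member"]}

# Unigrams only: the models look identical.
print(cosine(vectorize(model_a, 1), vectorize(model_b, 1)))            # 1.0
# Up to bigrams: the structural difference lowers the similarity.
print(round(cosine(vectorize(model_a, 2), vectorize(model_b, 2)), 3))  # 0.571
```

In this toy setting, unigram vectors cannot tell the two models apart, while adding bigrams exposes the differing references. A clustering step (for example hierarchical clustering over pairwise distances) could then operate on such vectors; which clustering method the paper uses is not stated in the abstract.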
