Realizing a feature-based framework for scientific data mining.

Abstract

This dissertation presents an efficient realization of a feature-based framework for analyzing scientific data. The main components of the framework are feature detection, feature classification, feature verification, and modeling the evolutionary behavior of the features. The usefulness of the first three steps is shown on datasets originating from computational molecular dynamics. Modeling the evolutionary behavior of the features involves: (i) understanding the trajectory of an individual feature; (ii) discovering the changes that features undergo due to various interactions; and (iii) understanding and deriving various spatio-temporal relationships among features.

A rule-based feature detection algorithm extracts the features. The rules are developed by making use of domain-specific properties. The algorithm is highly robust in the presence of noise: the features detected from noisy datasets are consistent with the features detected from noise-free data.

The trajectory of a feature is represented using physically meaningful parameters: linear velocity, angular velocity, and scale parameters. Most existing techniques abstract the feature to a single point and take into account only the change in position. The proposed representation scheme accounts for changes in the position, orientation, and size of the feature. The representation also aids in establishing various spatial and spatio-temporal relationships among the features. The usefulness of the scheme is evaluated on datasets originating from molecular dynamics and fluid flows.

The interactions among co-existing features are captured by a set of critical events: continuation, merging, bifurcation, creation, and dissipation. The algorithms establish correspondence among features based on the degree of overlap between the features in consecutive time steps.

Finally, a visual toolkit is developed that aids the user in establishing various spatial and spatio-temporal relationships. The toolkit achieves real-time performance. Its usefulness is shown on 2D fluid-flow datasets.

Before these algorithms were developed, manual analysis of a very small dataset of 100 MB took around 6 weeks. Feature extraction and classification for a 10 GB molecular dynamics dataset can now be performed in 25 hours, which is faster than the data generation time of 35 hours. (Abstract shortened by UMI.)
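
The abstract states only that correspondence is based on the degree of overlap between features in consecutive time steps. The following is a minimal sketch of how the five critical events could be derived from such overlaps, not the dissertation's actual algorithm. Representing a feature as a set of grid cells, normalizing the overlap by the smaller feature, the 0.5 threshold, and the names `cells`, `overlap_fraction`, and `classify_events` are all assumptions made for illustration.

```python
from collections import defaultdict


def cells(start, stop):
    """Hypothetical helper: a feature occupying grid cells (start..stop-1, 0)."""
    return {(x, 0) for x in range(start, stop)}


def overlap_fraction(a, b):
    """Shared cells divided by the size of the smaller feature (assumed measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def classify_events(prev, curr, threshold=0.5):
    """Classify events between two consecutive time steps.

    prev, curr: dict mapping feature id -> set of grid cells.
    Returns a sorted list of (event, prev_ids, curr_ids) tuples.
    """
    successors = defaultdict(list)    # prev id -> matched curr ids
    predecessors = defaultdict(list)  # curr id -> matched prev ids
    for p_id, p_cells in prev.items():
        for c_id, c_cells in curr.items():
            if overlap_fraction(p_cells, c_cells) >= threshold:
                successors[p_id].append(c_id)
                predecessors[c_id].append(p_id)

    events = []
    for p_id in prev:
        succ = successors.get(p_id, [])
        if not succ:
            events.append(("dissipation", (p_id,), ()))
        elif len(succ) > 1:
            events.append(("bifurcation", (p_id,), tuple(sorted(succ))))
    for c_id in curr:
        pred = predecessors.get(c_id, [])
        if not pred:
            events.append(("creation", (), (c_id,)))
        elif len(pred) > 1:
            events.append(("merging", tuple(sorted(pred)), (c_id,)))
        elif len(successors[pred[0]]) == 1:
            # Exactly one predecessor with exactly one successor.
            events.append(("continuation", (pred[0],), (c_id,)))
    return sorted(events)


# Toy data (invented for illustration): one example of each event.
prev = {
    "A": cells(0, 4),    # continues as "a"
    "B": cells(10, 18),  # splits into "b1" and "b2"
    "C": cells(30, 34),  # merges with "D" into "cd"
    "D": cells(34, 38),
    "E": cells(50, 53),  # has no successor
}
curr = {
    "a": cells(1, 5),
    "b1": cells(10, 13),
    "b2": cells(15, 18),
    "cd": cells(30, 38),
    "n": cells(70, 73),  # has no predecessor
}

events = classify_events(prev, curr)
for event in events:
    print(event)
```

In this sketch, a feature that both splits and takes part in a merge is reported under both events. A real implementation would need a policy for such compound cases, which the abstract does not describe.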

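Bibliographic record

Author: Mehta, Sameep
Author affiliation: The Ohio State University
Degree-granting institution: The Ohio State University
Subject: Computer Science
Degree: Ph.D.
Year: 2006
Pages: 197 p.
Original format: PDF
Language of text: English (eng)
Chinese Library Classification: Automation technology; computer technology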

