Knowledge discovery by fusion of information.

Abstract

Information fusion aims to develop intelligent approaches to integrating information from different complementary sources, so that a more comprehensive basis is obtained for data analysis and knowledge discovery. These approaches are of particular interest in geo-informatics and in text mining for bioinformatics, where large volumes of high-dimensional data from multiple correlated sources are available but their inherent relationships are not yet well understood.

In the first part of the thesis, we study data-driven approaches to improving aerosol optical thickness (AOT) retrieval from satellite-based and ground-based observations. We explore statistical models that complement deterministic retrieval algorithms in order to reduce the computational cost of processing very large remote sensing datasets. The experiments showed that, given a small fraction of deterministic retrievals for training, statistical models can significantly speed up AOT retrieval at the cost of only a slight loss in accuracy. Next, in order to construct an accurate and understandable AOT predictor, we combine spatially and temporally co-located data from multiple sources into a uniform dataset. Artificial neural networks (ANNs) are applied to derive optimal regression models. The results suggest that ANN models achieve overall accuracy superior or comparable to that of deterministic AOT retrievals. Decision tree analysis reveals that ANN predictions effectively improve on deterministic AOT retrievals under some surface or climate conditions.

The second part of the thesis addresses the problem of identifying biomedical publications with desired experimental evidence in MEDLINE, a major biomedical repository that collects millions of domain papers from different journals and conference proceedings. The learning task is made difficult by the richness of biomedical terminology sources, the diversity of ways in which experimental evidence is expressed, and the small number of labeled examples available for training. We propose a novel substring construction algorithm that derives attributes from semantically related terms with shared stems or morphemes. On five post-translational modification (PTM) test datasets, curators confirmed that the selected substrings significantly improve classification performance.

Finally, we summarize our work and propose future research directions. Specifically, we describe a framework that exploits online text to explain some aerosol results of satellite image analysis.

Bibliographic record

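Author: Han, Bo
Author affiliation: Temple University
Degree-granting institution: Temple University
Subject: Computer Science
Degree: Ph.D.
Year: 2007
Pagination: 118 p.
Total pages: 118
Original format: PDF
Language of text: English
Chinese Library Classification: Automation technology, computer technology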
