Multi-sourced Information Trustworthiness Analysis: Applications and Theory

Abstract

In the era of Big Data, data entries describing the same objects or events can come from a variety of sources. Some sources typically provide accurate information, but others may contain noisy or even erroneous information because of recording errors, device malfunction, background noise, or intent to manipulate the data. Conflicts among information from multiple sources are therefore inevitable. To discover useful knowledge, which is usually deeply buried in such complicated multi-sourced data, we have to conduct information trustworthiness analysis on all available data sources. In this thesis, we propose a series of approaches to multi-sourced information trustworthiness analysis: reliability-aware information integration, which discovers trustworthy information, and inconsistency detection, which detects untrustworthy information, both efficiently and effectively.

In reliability-aware information integration, it is critical to identify reliable sources, that is, sources that more often provide accurate information, so that we can pay more attention to their information and better discover the truths (i.e., trustworthy information). Unfortunately, there is no oracle telling us a priori which information source is more reliable. To correctly identify the truths, in Part I of this thesis we develop novel information integration methods that incorporate the estimation of source reliability. We explore the power of source reliability estimation for both data-level and model-level information. The objective is to jointly estimate which source is reliable and which piece of information is correct, where the information can be the raw data in data-level information integration or the model parameters in model-level information integration. In this part, we prove several properties of the proposed approaches via theoretical analysis and demonstrate their impact on real applications, such as indoor floorplan construction and crowdsourced question answering.

On the other hand, when unexpected disagreement is encountered across diverse information sources, i.e., data entities receive inconsistent information across multiple data sources, this might raise a red flag and require in-depth investigation. Part II of this thesis conducts inconsistency detection among multiple information sources to detect anomalies. We develop a series of tensor decomposition based algorithms for detecting inconsistent information in an unsupervised learning setting. By representing dynamic multi-sourced data as tensors, we propose different tensor decomposition based approaches, including an online method with theoretical guarantees for large-scale applications, to capture the common patterns across sources. An anomaly indicator is then derived by comparing source inputs against the common patterns to identify inconsistencies. The proposed frameworks have been applied to a wide variety of applications, from cybersecurity to hotel reviews to computer networks.

To sum up, in this thesis we conduct novel multi-sourced information trustworthiness analysis to discover trustworthy information or to detect untrustworthy information. For trustworthy information discovery, the proposed reliability-aware information integration framework gives us a tool to identify reliable sources and discover the true information about data entities from conflicting multi-sourced data. For untrustworthy information detection, the developed inconsistency detection approaches let us detect malicious data entities that receive inconsistent information across the available data sources. The frameworks we developed have been effectively applied in many areas, including hotel review analysis, cybersecurity, and computer networks, and have the potential to be applied to many other areas, such as healthcare, mobile sensing, and crowdsourcing. With advances in technology and devices, both the amount of data and the number of sources are still exploding, so there are great opportunities as well as numerous research challenges in inferring useful knowledge from multiple sources of massive data collections.
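
The joint estimation of source reliability and truths described for Part I can be illustrated with a generic iterative scheme. What follows is a minimal sketch and not the thesis's own algorithm: it assumes numeric claims with no missing values and at least two sources, aggregates truths as a reliability-weighted average, and updates weights with a negative-log-of-normalized-error rule that is a common choice in the truth discovery literature. The function name `truth_discovery`, its parameters, and the toy data are all hypothetical.

```python
import numpy as np


def truth_discovery(claims, n_iter=20, eps=1e-8):
    """Alternate between estimating truths and source weights.

    claims: array of shape (n_sources, n_objects); claims[k, i] is the
            value source k reports for object i (assumed fully observed).
    Returns (truths, weights). Assumes n_sources >= 2.
    """
    claims = np.asarray(claims, dtype=float)
    n_sources = claims.shape[0]

    # Start with no prior knowledge: every source is equally reliable.
    weights = np.full(n_sources, 1.0 / n_sources)

    # Per-object scale, so objects measured on different ranges
    # contribute comparably to a source's error (an illustrative choice).
    scale = claims.std(axis=0) + eps

    for _ in range(n_iter):
        # Step 1: truths are the reliability-weighted average of the claims.
        truths = weights @ claims / weights.sum()

        # Step 2: a source's error is its total normalized squared
        # deviation from the current truth estimates.
        errors = (((claims - truths) / scale) ** 2).sum(axis=1) + eps

        # Sources with a smaller share of the total error get larger weights.
        weights = -np.log(errors / errors.sum())

    # Final truths under the last weight estimate.
    truths = weights @ claims / weights.sum()
    return truths, weights


# Toy data (hypothetical): sources 0 and 1 roughly agree, source 2 deviates.
claims = np.array([
    [10.0, 20.0, 30.0],
    [10.2, 19.8, 30.1],
    [14.0, 25.0, 22.0],
])
truths, weights = truth_discovery(claims)
print("estimated truths:", truths)
print("source weights:  ", weights)
```

In this toy setting the deviating source ends up with the smallest weight, so the estimated truths sit closer to the two agreeing sources than a plain average would. Real data would also need handling for missing claims and categorical values, which this sketch leaves out.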


Bibliographic record

Author: Xiao, Houping
Author affiliation: State University of New York at Buffalo
Degree-granting institution: State University of New York at Buffalo
Subject: Computer science
Degree: Ph.D.
Year: 2018
Pages: 246
Original format: PDF
Language: English

