
Meta-data to enhance case-based prediction

Abstract

This thesis measures the regularity of case bases used in Case-Based Prediction (CBP) systems, and the reliability of their constituent cases, before the system is deployed, so as to inform user confidence in the solutions delivered. The reliability information, referred to as meta-data, is then used to enhance prediction accuracy. CBP is a strain of Case-Based Reasoning (CBR) that differs from it only in that the solution feature is a continuous value. Several factors make implementing such systems for prediction domains a challenge. Typically, the problem and solution spaces in prediction problems are unbounded, which makes it difficult to determine which portions of the domain the case base represents. In addition, such problem domains often contain noise and exhibit complex, poorly understood interactions between features. As a result, the overall regularity of the case base is distorted, which hinders the delivery of good-quality solutions. This research therefore presents techniques that address irregularity in case bases with the objective of increasing the prediction accuracy of solutions. Although several techniques have been proposed in the CBR literature to deal with irregular case bases, they are inapplicable to CBP problems. As an alternative, this research proposes generating relevant case-specific meta-data. The meta-data is used in Mantel's randomisation test to measure regularity in the case base objectively. Several novel visualisations based on the meta-data are presented to show the degree of regularity and to help identify suspect, unreliable cases whose reuse is very likely to yield poor solutions. Further, the performance of individual cases is recorded to judge their reliability, which is considered, along with their distance from the problem case, before they are selected for reuse.
The intention is to pass over unreliable cases in favour of relatively distant yet more reliable ones, and so enhance prediction accuracy. The proposed techniques are demonstrated on software engineering data sets, where the aim is to predict the duration of a software project from past completed projects recorded in the case base. Software engineering is a human-centric, volatile and dynamic discipline in which many unrecorded factors influence productivity. This degrades the regularity of case bases: cases are spread disproportionately across the problem and solution spaces, resulting in erratic prediction quality. Applying the proposed techniques gave insight into the three software engineering data sets used in this analysis. Mantel's test was very effective at measuring overall regularity within a case base, while the value of the visualisations varied with the size of the data set. Most importantly, the proposed case discrimination system, which is intended to reuse only reliable similar cases, increased prediction accuracy for all three data sets. The contributions of this research are thus novel approaches that use meta-data, first, to assess and visualise irregularities in case bases and cases from prediction domains and, second, to identify unreliable cases so that their reuse can be avoided in favour of more reliable cases, enhancing overall prediction accuracy.
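To make the regularity measurement concrete, the following is a minimal sketch of a Mantel randomisation test, not the thesis's own implementation. It assumes that regularity is read as the correlation between pairwise distances in the problem space and pairwise distances in the solution space. The function names, the Euclidean distance, the permutation count and the synthetic case base are all illustrative assumptions.

```python
import numpy as np


def pairwise_euclidean(points):
    """Return the full matrix of Euclidean distances between rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def mantel_test(dist_x, dist_y, n_perm=999, seed=0):
    """One-sided Mantel test between two square distance matrices.

    Returns the observed Pearson correlation of the upper-triangle entries
    and the permutation p-value for "correlation at least this large".
    """
    dist_x = np.asarray(dist_x, dtype=float)
    dist_y = np.asarray(dist_y, dtype=float)
    n = dist_x.shape[0]
    upper = np.triu_indices(n, k=1)          # each pair of cases counted once
    x = dist_x[upper]
    r_obs = np.corrcoef(x, dist_y[upper])[0, 1]

    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        # Relabel cases: rows and columns of one matrix are permuted together.
        y_perm = dist_y[np.ix_(perm, perm)][upper]
        if np.corrcoef(x, y_perm)[0, 1] >= r_obs:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)      # observed labelling counts as one
    return r_obs, p_value


# Hypothetical case base: 30 projects, 3 features, duration driven by the features.
rng = np.random.default_rng(42)
features = rng.uniform(0.0, 10.0, size=(30, 3))
durations = features.sum(axis=1) + rng.normal(0.0, 1.0, size=30)

r, p = mantel_test(pairwise_euclidean(features),
                   np.abs(durations[:, None] - durations[None, :]))
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Under this reading, a high correlation with a small p-value suggests that similar problems tend to have similar solutions, which is the property case reuse depends on.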
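The idea of passing over an unreliable near case in favour of a more distant but reliable one can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify how reliability is scored, so the leave-one-out error bookkeeping, the `max_error` threshold and the names `case_reliability` and `predict_duration` are hypothetical, and the data are invented.

```python
import numpy as np


def case_reliability(features, solutions):
    """Hypothetical meta-data: mean absolute error a case causes when it is
    reused as the nearest neighbour of another case (lower is better).
    Cases that are never retrieved get NaN."""
    dist = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)            # a case may not retrieve itself
    nearest = dist.argmin(axis=1)             # nearest[i] is reused to predict case i
    errors = [[] for _ in solutions]
    for target, source in enumerate(nearest):
        errors[source].append(abs(solutions[source] - solutions[target]))
    return np.array([np.mean(e) if e else np.nan for e in errors])


def predict_duration(query, features, solutions, reliability, max_error, k=1):
    """Average the solutions of the k nearest cases whose recorded error does
    not exceed `max_error`; fall back to plain k-NN if none qualify."""
    order = np.argsort(np.linalg.norm(features - query, axis=1))
    # NaN > max_error is False, so never-retrieved cases are treated as trusted.
    trusted = [i for i in order if not (reliability[i] > max_error)]
    chosen = trusted[:k] if trusted else list(order[:k])
    return float(np.mean(solutions[chosen]))


# Invented case base: the case at x = 2.4 has an anomalous duration of 60.
features = np.array([[0.0], [1.1], [2.0], [2.4], [3.0], [4.1]])
durations = np.array([10.0, 12.0, 14.0, 60.0, 16.0, 18.0])
reliability = case_reliability(features, durations)

query = np.array([2.45])
plain = predict_duration(query, features, durations, reliability, max_error=np.inf)
guarded = predict_duration(query, features, durations, reliability, max_error=30.0)
print(plain, guarded)
```

In this toy example the nearest case is the anomalous one, so plain nearest-neighbour reuse returns its duration, while the guarded variant skips it and reuses the next nearest case. One limitation of such a simple score is that neighbours of an anomalous case are also penalised, because they predict it badly.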

Bibliographic record

  • Author

    Premraj Rahul;

  • Affiliation
  • Year 2006
  • Total pages
  • Original format PDF
  • Language of text English
  • Chinese Library Classification
