Journal: Communications and Network

Threshold Selection Study on Fisher Discriminant Analysis Used in Exon Prediction for Unbalanced Data Sets


Abstract

In gene prediction, Fisher discriminant analysis (FDA) is used to separate protein-coding regions (exons) from non-coding regions (introns). Usually, the positive and negative data sets are of the same size, provided enough data are available. However, when the data are insufficient or the two sets differ in size, the threshold used in FDA may strongly influence the prediction results. This paper presents a study on the selection of the threshold. The eigenvalue of each exon/intron sequence is computed using the Z-curve method with 69 variables. The experimental results suggest that the size of the data sets, their standard deviation, and the threshold are the three key elements to take into consideration to improve the prediction results.
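To make the role of the threshold concrete, the following is a minimal sketch, not the paper's procedure. It assumes each sequence has already been turned into a 69-component vector (the Z-curve computation itself is not shown, and synthetic Gaussian data stand in for real exon/intron features). The function names, the small ridge term, and the three candidate threshold rules (midpoint, size-weighted, standard-deviation-weighted) are illustrative assumptions chosen to show how class size and spread can shift the decision point.

```python
import numpy as np


def fisher_direction(pos, neg):
    """Fisher discriminant direction for two classes (rows are samples)."""
    mean_pos, mean_neg = pos.mean(axis=0), neg.mean(axis=0)
    # Within-class scatter: sum of the two per-class scatter matrices.
    scatter = (np.cov(pos, rowvar=False) * (len(pos) - 1)
               + np.cov(neg, rowvar=False) * (len(neg) - 1))
    # Small ridge term (an assumption) keeps the solve stable on small samples.
    scatter += 1e-6 * np.eye(scatter.shape[0])
    return np.linalg.solve(scatter, mean_pos - mean_neg)


def candidate_thresholds(pos, neg, w):
    """Three illustrative thresholds on the projected scores (hypothetical rules)."""
    z_pos, z_neg = pos @ w, neg @ w
    m_pos, m_neg = z_pos.mean(), z_neg.mean()
    s_pos, s_neg = z_pos.std(ddof=1), z_neg.std(ddof=1)
    n_pos, n_neg = len(z_pos), len(z_neg)
    return {
        # Halfway between the projected class means; ignores size and spread.
        "midpoint": (m_pos + m_neg) / 2,
        # Mean of all projected samples; pulled toward the larger class.
        "size_weighted": (n_pos * m_pos + n_neg * m_neg) / (n_pos + n_neg),
        # Point that is equally many standard deviations from both means.
        "std_weighted": (s_neg * m_pos + s_pos * m_neg) / (s_pos + s_neg),
    }


def sensitivity_specificity(pos, neg, w, threshold):
    """Fraction of positives above and negatives at or below the threshold."""
    sens = np.mean(pos @ w > threshold)
    spec = np.mean(neg @ w <= threshold)
    return float(sens), float(spec)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic, unbalanced stand-ins for exon (positive) and intron (negative) vectors.
    exons = rng.normal(0.5, 1.0, size=(60, 69))
    introns = rng.normal(0.0, 2.0, size=(300, 69))
    w = fisher_direction(exons, introns)
    for name, c in candidate_thresholds(exons, introns, w).items():
        sens, spec = sensitivity_specificity(exons, introns, w, c)
        print(f"{name}: threshold={c:.4f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Sequences whose projected score exceeds the threshold are labeled as exons here. All three candidate thresholds lie between the two projected class means, and comparing the resulting sensitivity and specificity shows how an unbalanced size or unequal spread changes which threshold is preferable.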
