Conference: IEEE International Conference on Machine Learning and Applications

Reordering Genomic Sequences for Enhanced Classification via Compression Analytics

Abstract

The full implications of sharing genomic information are still largely unknown. Understanding what attributes can be inferred from available information is therefore a critical part of genomic privacy and security. We show that compression analytics are successful at classifying, or inferring, unknown attributes of genomic sequences without the need for a predefined feature set and with very little training data. Compression analytics perform best when predictable elements within a sequence are local; however, long-range dependencies are ubiquitous in the human genome. We therefore consider a variety of schemes to reorder genomic sequences so as to localize predictable elements and improve the performance of compression analytics. Compression analytics on both native and reordered sequences are shown to outperform more traditional, feature-based machine learning approaches.
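
The abstract does not say which compressor or decision rule the authors use, so the following is only a minimal sketch of the general idea behind compression-based classification. It assumes zlib as the compressor and a simple rule: assign a query sequence to the class whose training data lets the query be encoded in the fewest additional bytes. The function names (`compressed_size`, `classify`) and the synthetic sequences are hypothetical and are not taken from the paper.

```python
import random
import zlib
from typing import Dict, List


def compressed_size(data: bytes) -> int:
    """Return the length in bytes of the zlib-compressed input (level 9)."""
    return len(zlib.compress(data, 9))


def classify(query: str, training: Dict[str, List[str]]) -> str:
    """Pick the class whose training data compresses the query best.

    The score for a class is C(corpus + query) - C(corpus): the extra
    bytes needed to encode the query once the compressor has already
    seen that class's training sequences. The lowest score wins.
    """
    if not training:
        raise ValueError("training must contain at least one class")
    q = query.encode("ascii")

    def extra_bytes(label: str) -> int:
        corpus = "".join(training[label]).encode("ascii")
        return compressed_size(corpus + q) - compressed_size(corpus)

    return min(training, key=extra_bytes)


# Toy demonstration on synthetic data (assumption: not real genomic sequences).
rng = random.Random(0)
class_a = "".join(rng.choice("ACGT") for _ in range(2000))
class_b = "".join(rng.choice("ACGT") for _ in range(2000))
training = {"A": [class_a], "B": [class_b]}

# A query that shares a long substring with class A's training sequence.
query = class_a[500:800]
predicted = classify(query, training)
print(predicted)
```

No feature set is defined anywhere in this sketch; the compressor alone decides what counts as shared structure. zlib's DEFLATE format can only refer back to data within a 32 KB window, which illustrates the locality limitation the abstract describes: repeated material farther apart than the window is not exploited, and that is the motivation for reordering sequences so that predictable elements end up close together.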
