k-link EST clustering: evaluating error introduced by chimeric sequences under different degrees of linkage

Bragg LM; Stone G

Abstract

MOTIVATION: The clustering of expressed sequence tags (ESTs) is a crucial step in many sequence analysis studies that require a high level of redundancy. Chimeric sequences, while uncommon, can make achieving the optimal EST clustering a challenge. Single-linkage algorithms are particularly vulnerable to the effects of chimeras. To avoid chimera-facilitated erroneous merges, researchers using single-linkage algorithms are forced to use stringent sequence-similarity thresholds, which reduce the sensitivity of the clustering algorithm.

RESULTS: We introduce the concept of k-link clustering for EST data. We evaluate how clustering error rates vary over a range of linkage thresholds. Using k-link, we show that Type II error decreases as the number of shared ESTs (i.e. links) required increases. We observe a base level of Type II error, likely caused by the presence of unmasked low-complexity or repetitive sequence. We find that Type I error increases gradually with increased linkage. To minimize the Type I error introduced by increased linkage requirements, we propose an extension to k-link that modifies the required number of links with respect to the size of the clusters being compared.

AVAILABILITY: k-link is written in C++ and is available under the terms of the GNU General Public License (GPL) from http://www.bioinformatics.csiro.au/products.shtml.
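The abstract does not spell out how links are counted, so the following is only a minimal sketch of the general idea, not the authors' C++ implementation. It assumes one plausible reading: two ESTs that pass the similarity threshold are joined only if they also share at least k - 1 other ESTs as common neighbours, so that k = 1 reduces to ordinary single linkage. The function name `k_link_clusters`, the input format, and the toy data are all hypothetical.

```python
from collections import defaultdict


def k_link_clusters(ests, similar_pairs, k):
    """Cluster ESTs; a similar pair is joined only if the two ESTs
    have at least k - 1 common neighbours (assumed linkage rule)."""
    # Build the neighbour sets from the list of similar EST pairs.
    neighbours = defaultdict(set)
    for a, b in similar_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Union-find over EST identifiers.
    parent = {e: e for e in ests}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in similar_pairs:
        shared = neighbours[a] & neighbours[b]
        if len(shared) >= k - 1:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for e in ests:
        clusters[find(e)].add(e)
    return sorted(sorted(c) for c in clusters.values())


# Toy data (invented): two genes with three ESTs each, plus one chimeric
# sequence that is similar to a single EST from each gene.
ESTS = ["a1", "a2", "a3", "b1", "b2", "b3", "chimera"]
PAIRS = [
    ("a1", "a2"), ("a1", "a3"), ("a2", "a3"),
    ("b1", "b2"), ("b1", "b3"), ("b2", "b3"),
    ("a1", "chimera"), ("b1", "chimera"),
]

print(k_link_clusters(ESTS, PAIRS, k=1))  # single linkage
print(k_link_clusters(ESTS, PAIRS, k=2))  # one supporting EST required
```

On this toy input, k = 1 lets the chimera bridge the two genes into a single cluster, while k = 2 keeps the genes apart and leaves the chimera as a singleton. Raising k further would start to split genuine clusters whose members have few common neighbours, which is consistent with the trade-off between the two error types described in the abstract.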


Bibliographic record

Source: Bioinformatics, 2009, issue 18, 7 pages
Authors: Bragg LM; Stone G
Original format: PDF
Language: English
Chinese Library Classification: Bioengineering (biotechnology)
Keywords: Algorithms; Bioinformatics; Chimeras; Data processing; Expressed sequence tags


