Journal: BMC Bioinformatics

An Automated Method for Rapid Identification of Putative Gene Family Members in Plants


Abstract

Background: Gene duplication events have played a significant role in genome evolution, particularly in plants. Exhaustive searches for all members of a known gene family, as well as the identification of new gene families, have become increasingly important. Subfunctionalization via changes in regulatory sequences following duplication (adaptive selection) appears to be a common mechanism of evolution in plants and can be accompanied by purifying selection on the coding region. Such negative selection can be detected by a bias toward synonymous over nonsynonymous substitutions. However, identifying this bias requires many steps, usually employing several different software programs. We have simplified the process and significantly shortened the time required by condensing many steps into a few scripts or programs that rapidly identify putative gene family members, beginning with a single query sequence.

Results: In this report we 1) describe the software tools (SimESTs, PCAT, and SCAT) developed to automate gene family identification, 2) demonstrate the validity of the method by correctly identifying 3 of 4 PAL gene family members from Arabidopsis using EST data alone, 3) identify 2 to 6 previously unidentified CAD gene family members from Glycine max, and 4) identify 2 members of a putative Glycine max gene family previously unidentified in any plant species.

Conclusion: Gene families in plants, particularly the subset in which purifying selection has occurred in the coding region, can be identified quickly and easily by integrating our software tools with commonly available contig assembly and ORF identification programs.
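The bias toward synonymous substitutions mentioned in the Background can be illustrated with a minimal Python sketch. It is not the SimESTs, PCAT, or SCAT software, whose internals are not described in this abstract. The function name `count_substitutions` and the example sequences are hypothetical. The sketch assumes two coding sequences that are already aligned in frame, have no gaps, and use the standard genetic code. It counts only codons that differ at exactly one position, and it returns raw counts that are not normalized by the number of synonymous and nonsynonymous sites, so it is a rough indicator rather than a dN/dS estimate.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): AMINO_ACIDS[i]
    for i, codon in enumerate(product(BASES, repeat=3))
}


def count_substitutions(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts for two aligned ORFs.

    Hypothetical helper: only codon pairs that differ at exactly one
    nucleotide are classified; identical codons, codons with several
    differences, and codons with ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be in-frame and of equal length")

    synonymous = 0
    nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # ambiguous base such as N
        differences = sum(x != y for x, y in zip(codon_a, codon_b))
        if differences != 1:
            continue  # substitution pathway is ambiguous; skip in this sketch
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Made-up example: two silent changes (GCT>GCC, CTG>CTA) and one
# amino acid change (GAT>GAA, Asp>Glu).
print(count_substitutions("ATGGCTCTGAAAGAT", "ATGGCCCTAAAAGAA"))  # prints (2, 1)
```

A clear excess of synonymous over nonsynonymous changes between two putative family members is consistent with purifying selection on the coding region. A real analysis would normalize by the number of available sites of each type.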
