Journal information

  • Journal name:

    -

  • Frequency: Annual
  • NLM title: Evol Bioinform Online
  • ISO abbreviation: -
  • ISSN: -

459 results
  • Linearization of Median Genomes Under the Double-Cut-and-Join-Indel Model
    Abstract: Reconstruction of the median genome consisting of linear chromosomes from three given genomes is known to be intractable. There exist efficient methods for solving a relaxed version of this problem, where the median genome is allowed to have circular chromosomes. We propose a method for construction of an approximate solution to the original problem from a solution to the relaxed problem and prove a bound on its approximation error. Our method also provides insights into the combinatorial structure of genome transformations with respect to the appearance of circular chromosomes.
  • Feature Learning of Virus Genome Evolution With the Nucleotide Skip-Gram Neural Network
    • Author: Hyunjin Shim
    • Journal: Evolutionary Bioinformatics Online
    • 2019, Issue -
    Abstract: Recent studies reveal that even the smallest genomes such as viruses evolve through complex and stochastic processes, and the assumption of independent alleles is not valid in most applications. Advances in sequencing technologies produce multiple time-point whole-genome data, which enable potential interactions between these alleles to be investigated empirically. To investigate these interactions, we represent alleles as distributed vectors that encode relationships with other alleles in the course of evolution and apply artificial neural networks to time-sampled whole-genome datasets for feature learning. We build this platform using methods and algorithms derived from natural language processing (NLP), and we denote it as the nucleotide skip-gram neural network. We learn distributed vectors of alleles using the changes in allele frequency of echovirus 11 in the presence or absence of the disinfectant (ClO2) from the experimental evolution data. Results from the training using the new open-source software TensorFlow show that the learned distributed vectors can be clustered using principal component analysis and hierarchical clustering to reveal a list of non-synonymous mutations that arise on the structural protein VP1 in connection to the candidate mutation for ClO2 adaptation. Furthermore, this method can account for recombination rates by setting the extent of interactions as a biological hyper-parameter, and the results show that the most realistic scenario of mid-range interactions across the genome is most consistent with the previous studies.
  • SAliBASE: A Database of Simulated Protein Alignments
    Abstract: Simulated alignments are alternatives to manually constructed multiple sequence alignments for evaluating the performance of multiple sequence alignment tools. The importance of simulated sequences is recognized because their true evolutionary history is known, which is very helpful for reconstructing accurate phylogenetic trees and alignments. However, generating simulated alignments requires expertise in bioinformatics tools and takes several hours even for a few hundred simulated sequences. It becomes a tedious job for an end user who needs a few datasets of a variety of simulated sequences. Currently, no databank is available from which researchers can download simulated sequences/alignments for their study. The major focus of our study was to develop a database of simulated protein sequences (SAliBASE) based on varying parameters such as insertion rate, deletion rate, sequence length, number of sequences, and indel size. Each dataset has a corresponding alignment as well. This repository is very useful for evaluating multiple alignment methods.
  • Modelling Animal Interactive Rhythms in Communication
    Abstract: Time is one crucial dimension conveying information in animal communication. Evolution has shaped animals’ nervous systems to produce signals with temporal properties fitting their socio-ecological niches. Many quantitative models of mechanisms underlying rhythmic behaviour exist, spanning insects, crustaceans, birds, amphibians, and mammals. However, these computational and mathematical models are often presented in isolation. Here, we provide an overview of the main mathematical models employed in the study of animal rhythmic communication among conspecifics. After presenting basic definitions and mathematical formalisms, we discuss each individual model. These computational models are then compared using simulated data to uncover similarities and key differences in the underlying mechanisms found across species. Our review of the empirical literature is admittedly limited. We stress the need for comparative computer simulations – both before and after animal experiments – to better understand animal timing in interaction. We hope this article will serve as a potential first step towards a common computational framework to describe temporal interactions in animals, including humans.
  • Exploring the Structure of Haplotype Blocks and Genetic Diversity in Chinese Indigenous Pig Populations for Conservation Purposes
    Abstract: Chinese indigenous pigs in the Taihu Lake region are well known for their high fecundity and other excellent characteristics. To better understand the characteristics of these breeds and to provide the government and breeders with a molecular basis for formulating a reasonable conservation policy, we used next-generation sequencing data to explore the structure of haplotype blocks and the genetic diversity of the 7 populations, which is relevant for the management and conservation of these important genetic resources. In this study, a total of 131 300 single-nucleotide polymorphisms with minor allele frequencies ⩾0.05 were obtained for further analysis. In general, within-breed genetic diversities (He, Ho, Pn, Ar) are similar among these 7 pig populations in the Taihu Lake region. Average values for the inbreeding coefficient estimates in the 7 populations are 0.110 (F1), 0.056 (F2), and 0.078 (F3). All the breeds have seen a continuous decline in Ne estimates over time, with the FJ and SW populations having very similar curves. Moreover, the Ne of SMS pig breeds was smaller than that of other Chinese pig breeds, indicating that SMS pig breeds underwent stronger selection pressure than other Chinese pig breeds. The average genetic distances among the 7 populations in the Taihu Lake region were 0.235 (MMS), 0.240 (SMS), 0.269 (EH), 0.248 (MI), 0.221 (FJ), 0.254 (JX), and 0.212 (SW). A summary of the number of haplotype blocks and haplotype diversity is also presented. This study provides a deep understanding of the current state of conservation in this region, offering insight that can help government departments and breeding planners formulate more reasonable preservation policies.
  • Machine Learning-Based Mechanical Analysis of Trabecular Bone
    Abstract: “Bone remodeling” is a dynamic process, and multiphase analysis combined with a forecasting algorithm can help biologists and orthopedists to interpret laboratory-generated results and to apply them in improving applications in the fields of “drug design, treatment, and therapy” of diseased bones. The metastasized bone microenvironment has always remained a challenging puzzle for researchers. In this research, a multiphase computational model is interfaced with an artificial intelligence algorithm in a hybrid manner. Trabecular surface remodeling is presented in this article with the aid of videographic footage, and the associated parametric thresholds are derived from artificial intelligence and clinical data.
  • Evolutionary Bioinformatics Reviewers: 2018
    • Author:
    • Journal: Evolutionary Bioinformatics Online
    • 2019, Issue -
    Abstract:
  • Dual Analysis of Virus-Host Interactions: The Case of Ostreid herpesvirus 1 and the Cupped Oyster Crassostrea gigas
    Abstract: Dual analyses of the interactions between Ostreid herpesvirus 1 (OsHV-1) and the bivalve Crassostrea gigas during infection can unveil events critical to the onset and progression of this viral disease and can provide novel strategies for mitigating and preventing oyster mortality. Among the currently used “omics” technologies, dual transcriptomics (dual RNA-seq) coupled with the analysis of viral DNA in the host tissues has greatly advanced the knowledge of the genes and pathways that contribute most to host defense responses, the expression profiles of annotated and unknown OsHV-1 open reading frames (ORFs), and viral genome variability. In addition to dual RNA-seq, proteomics and metabolomics analyses have the potential to add complementary information needed to understand how a malacoherpesvirus can redirect and exploit the vital processes of its host. This review explores our current knowledge of “omics” technologies in the study of host-pathogen interactions and highlights relevant applications of these fields of expertise to the complex case of C. gigas infections by OsHV-1, which currently threaten the mollusk production sector worldwide.
  • Draft Genome Sequencing of the Pathogenic Fungus Cladosporium phlei ATCC 36193 Identifies Candidates of Novel Polyketide Synthase Genes Involved in Perylenequinone-Group Pigment Production
    Abstract: Cladosporium phlei, which causes purple eyespot disease, has attracted attention as a source of phleichrome from the perylenequinone group of pigments. Although this agent is important in photodynamic therapy, there are no genome sequences for the species. Here, we sequenced the genome of C. phlei and report the draft sequence. The total length of the draft genome was approximately 31.8 Mb, and 9571 genes were predicted. Phylogenetic analysis showed that Cladosporium sphaerospermum, Rachicladosporium sp., and Rachicladosporium antarcticum were closely related, and this result corresponded to the taxonomic data. In addition to the draft genome sequence, we report four candidates of new polyketide synthase (PKS) genes involved in the production of perylenequinone-group pigments.
  • Response Surface Analysis of Genomic Prediction Accuracy Values Using Quality Control Covariates in Soybean
    Abstract: Genomic prediction (GP) is an important and broadly used tool for selection and for increasing yield and genetic gain in plant breeding programs. Genomic prediction is a technique where molecular marker information and phenotypic data are used to predict the phenotype (e.g., yield) of individuals for which only marker data are available. Higher prediction accuracy can be achieved not only by using efficient models but also by using quality molecular marker and phenotypic data. The steps of a typical quality control (QC) of marker data include the elimination of markers with a certain level of minor allele frequency (MAF) and missing marker values and the imputation of missing marker values. In this article, we evaluated how the prediction accuracy is influenced by the combination of 12 MAF values, 27 different percentages of missing marker values, and 2 imputation techniques (IT; naïve and Random Forest (RF)). We constructed a response surface of prediction accuracy values for the two ITs as a function of MAF and percentage of missing marker values using soybean data from the University of Nebraska–Lincoln Soybean Breeding Program. We found that both the genetic architecture of the trait and the IT affect the prediction accuracy, implying that we have to be careful how we perform QC on the marker data. For the corresponding combinations of MAF and percentage of missing values, we observed that implementing the RF imputation increased the number of markers by 2 to 5 times compared with the simple naïve imputation method, which is based on the mean allele dosage of the non-missing values at each locus. We conclude that there is no single strategy (combination of the QCs and imputation method) that outperforms the others for all traits.
  • Molecular Basis of pH-Modulated HIV gp120 Binding Revealed
    Abstract: Decades of research have yet to provide a vaccine for HIV, the virus which causes AIDS. Recent theoretical research has turned attention to mucosa pH levels over systemic pH levels. Previous research in this field developed a computational approach for determining pH sensitivity that indicated higher potential for transmission at mucosa pH levels present during intercourse. The process was extended to incorporate a principal component analysis (PCA)-based machine learning technique for classification of gp120 proteins against a known transmitted variant called Biomolecular Electro-Static Indexing (BESI). The original process has since been extended to the residue level by a process we termed Electrostatic Variance Masking (EVM) and used in conjunction with BESI to determine structural differences present among various subspecies across Clades A1 and C. Results indicate that structures outside of the core selected by EVM may be responsible for binding affinity observed in many other studies and that pH modulation of select substructures indicated by EVM may influence specific regions of the viral envelope protein (Env) involved in protein-protein interactions.
  • Evolutionary Analysis of Makorin Ring Finger Protein 3 Reveals Positive Selection in Mammals
    Abstract: Makorin ring finger proteins (MKRNs) are part of the ubiquitin-proteasome system, a complex system important for cell functions. Ubiquitin fate through proteolytic and non-proteolytic pathways varies, depending on the covalent linkage between ubiquitin and protein substrates. Makorin ring finger protein 3 is an integral part of the covalent linkage of ubiquitin to protein substrates. Similar to other imprinted genes, MKRN3 also evolves under positive selection; however, which codons are specifically selected in MKRN3 during evolution remains to be explored. Different maximum-likelihood (ML) codon-based methodologies were used to ascertain positive selection signatures in 22 mammalian sequences of MKRN3 and to probe individual codons for positive selection signatures. By applying the HyPhy software package implemented in the Data Monkey Web Server and CODEML implemented in PAML, evolutionary analysis based on two ML frameworks was conducted. The analysis was executed by comparing the PAML models M1a against M2a and M7 against M8, and the likelihood ratio test (LRT) statistic 2∆lnL was obtained from the log-likelihoods. M1a contributed ω1 (dN/dS) with LRT value (∆lnL) 12.01, and positive selection was found in M2a with ω3 = 2.23603. To further improve the selection test, M8 was compared to M7 with 2∆lnL (LRT) 30.17, and M8 showed positive selection with ω = 1.55759. The data fit M8 better than M7, which suggests that M8 was the most significant model of selection. M8 was judged encouraging for this analysis and used to establish positive selection of MKRN3 proteins. We found Gly312 as a positively selected amino acid in a zinc finger motif/Really Interesting New Gene (RING) finger motif; the former is involved in RNA binding and the latter in ubiquitin ligase activity of the protein, vital for protein function. Selection analyses of MKRNs might advance the development of unique approaches that could lead to genetic progress through the selection of superior individuals with higher breeding values for certain traits as parents of the next generation.
  • In Silico Analysis of Codon Bias in Molecular and Phylogenetic Data of Dipterocarpaceae Species
    Abstract: Introduction: A DNA barcode, a molecular marker, is used to distinguish among closely related species, and it can be applied across a broad range of taxa to understand ecology and evolution. The maturase K gene (matK) and the ribulose bisphosphate carboxylase/oxygenase form I gene (rbcL) of the chloroplast are highly conserved in plant systems and are used as core barcodes. The present work entails a comprehensive examination of threatened plant species based on the success of discrimination by DNA barcode under selection pressure.
  • Sequencing-Based Transcriptome-Wide Targeted Genotyping for Evolutionary and Ecological Studies
    Abstract: Transcriptome-wide targeted genotyping is highly attractive for evolutionary and ecological studies but, until recently, accomplishing this goal presented a major technical barrier for the study of non-model organisms. Our group has recently developed a high-throughput targeted genotyping approach (called HD-Marker) based on the high specificity and accuracy of oligo extension-ligation assays that facilitates the design of assays tailored to meet specific genotyping needs. HD-Marker allows for targeted genotyping of over 10 000 genes in a single tube, with strikingly high capture rate (98%-99%) and genotyping accuracy (97%-99%). With the remarkable advantages of cost-effectiveness and flexibility, we envision that HD-Marker has broad application potential in evolutionary and ecological studies.
  • Vector Quantized Spectral Clustering Applied to Whole Genome Sequences of Plants
    Abstract: We develop a Vector Quantized Spectral Clustering (VQSC) algorithm that is a combination of spectral clustering (SC) and vector quantization (VQ) sampling for grouping genome sequences of plants. The inspiration here is to use SC for its accuracy and VQ to make the algorithm computationally cheap (the complexity of SC is cubic in terms of the input size). Although the combination of SC and VQ is not new, the novelty of our work is in developing the crucial similarity matrix in SC as well as the use of k-medoids in VQ, both adapted for the plant genome data. For soybean, we compare our approach with commonly used techniques like Unweighted Pair Group Method with Arithmetic Mean (UPGMA) and Neighbor Joining (NJ). Experimental results show that our VQSC outperforms both these techniques significantly in terms of cluster quality (average improvement of 21% over UPGMA and 24% over NJ) as well as time complexity (order of magnitude faster than both UPGMA and NJ).
  • A Pathway-Based Strategy to Identify Biomarkers for Lung Cancer Diagnosis and Prognosis
    Abstract: Current research has identified several potential biomarkers for lung cancer diagnosis or prognosis. However, most of these biomarkers are derived from a relatively small number of samples using algorithms at the gene level. Hence, gene expression signatures discovered in these studies have little overlap. In this study, we proposed a new strategy to identify biomarkers from multiple datasets at the pathway level. We integrated the genome-wide expression data of lung cancer tissues from 13 published studies and applied our strategy to identify lung cancer diagnostic and prognostic biomarkers. We identified a 32-gene signature that differentiates lung adenocarcinomas from other lung cancer subtypes. We also discovered a 43-gene signature that can predict the outcome of human lung cancers. We tested their performance in several independent cohorts, which confirmed their robust prognostic and diagnostic power. Furthermore, we showed that the proposed gene expression signatures were independent of several traditional clinical indicators in lung cancer management. Our results suggest that the pathway-based strategy is useful for identifying transcriptomic biomarkers from large-scale gene expression datasets that were collected from multiple sources.
  • Identification of Latent Infection of Strawberry by Botrytis cinerea Using Metabolome Profiling
    Abstract: In plant-pathogen interaction systems, plant metabolism is usually perturbed in the early stages of infection, long before visible symptoms appear. To identify the latent infection of strawberry by Botrytis cinerea by metabolome profiling, a metabolomics method based on gas chromatography and mass spectrometry was applied to identify the affected metabolites and discriminate diseased plants from healthy ones. An orthogonal partial least squares (OPLS) score plot showed that the metabolic profiling clearly separated B. cinerea-infected strawberry plants at 2, 5, and 7 days after infection from non-infected healthy plants. Combined analysis of variance (ANOVA) and OPLS analysis revealed candidate biomarkers of plant resistance and of infection and expansion of the pathogen in the plants. Among them, hexadecanoic acid, octadecanoic acid, sucrose, β-lyxopyranose, melibiose, and 1,1,4a-Trimethyl-5,6-dimethylenedecahydronaphthalene were closely related to the early stage of disease development when symptoms were not visible. A discrimination method that could distinguish Botrytis gray mold diseased strawberry plants from healthy ones was established based on the partial least squares discriminant analysis (PLS-DA) model with a correct recognition accuracy of 100%. This research offers a good application of metabolome profiling for early diagnosis of plant disease and exploration of interaction mechanisms.
  • Transcriptome Sequencing Analysis Provides Insights Into the Response of Lilium pumilum to Fusarium oxysporum
    Abstract: Lily basal rot, caused by Fusarium oxysporum f. sp. lilii, is one of the most serious diseases of lily. Although lily germplasm that is resistant to F. oxysporum has been used in disease-resistant breeding, few studies on its molecular mechanism of disease resistance have been reported. To comprehensively study the mechanism of resistance to F. oxysporum, transcriptome sequencing of root tissues from Lilium pumilum inoculated with F. oxysporum or sterile water for 6, 12, or 24 h was performed. A total of 50 GB of data were obtained from the transcriptome sequencing of the 6 L. pumilum samples, and 217 098 Unigenes were obtained after the de novo assembly, of which 38.36% were annotated. The sequencing results showed that the numbers of differentially expressed genes at 6, 12, and 24 h after inoculation compared with the control were 111, 254, and 2500, respectively. The functional enrichment analysis of the differentially expressed genes showed that several pathways were involved in responses of L. pumilum, mainly including starch and sucrose metabolism, glycolysis/gluconeogenesis, phenylpropanoid biosynthesis, plant hormone signal transduction, flavonoid biosynthesis, vitamin B6 (VB6) biosynthesis, acid biosynthesis, proteasome, and ribosome. Transcription factor analysis revealed that the WRKY and ERF families played important roles in responses of L. pumilum to F. oxysporum. The results of this study elucidate the molecular responses to F. oxysporum in lily and lay a theoretical foundation for improving lily breeding and strategies for lily basal rot resistance.
  • A Cost-Effective Extreme Case-Control Design Using a Resampling Method
    Abstract: Nested case-control sampling design is a popular method in a cohort study whose events are often rare. The controls are randomly selected with or without the matching variable fully observed across all cohort samples to control confounding factors. In this article, we propose a new nested case-control sampling design incorporating both extreme case-control design and a resampling technique. This new algorithm has two main advantages with respect to the conventional nested case-control design. First, it inherits the strength of extreme case-control design such that it does not require the risk sets at each event time to be specified. Second, the target number of controls can be determined solely by the budget and time constraints, and the resampling method allows an undersampling design, which means that the total number of sampled controls can be smaller than the number of cases. A simulation study demonstrated that the proposed algorithm performs well even when we have a smaller number of controls compared with the number of cases. The proposed sampling algorithm is applied to a public dataset collected for the “Thorotrast Study.”
  • Hybrid-Model-Based Genomic Prediction Using Canopy Coverage Image and Genotypic Information in Soybean
    Abstract: Prediction techniques are important in plant breeding as they provide a tool for selection that is more efficient and economical than traditional phenotypic and pedigree-based selection. The conventional genomic prediction models include molecular marker information to predict the phenotype. With the development of new phenomics techniques, we have the opportunity to collect image data on the plants and to extend the traditional genomic prediction models by incorporating a diverse set of information collected on the plants. In our research, we developed a hybrid matrix model that incorporates molecular marker and canopy coverage information as a weighted linear combination to predict grain yield for the soybean nested association mapping (SoyNAM) panel. To obtain the testing and training sets, we clustered the individuals based on their marker and canopy information using 2 different clustering techniques, and we compared 5 different cross-validation schemes. The results showed that the predictive ability of the models was the highest when both the canopy and marker information was included, and it was the lowest when only the canopy information was included.