BMC Bioinformatics

  • DeSignate: detecting signature characters in gene sequence alignments for taxon diagnoses
    Abstract: Historically, taxonomic diagnoses have been restricted to morphological characters distinguishing a particular taxon (the query group) from related taxa (the reference group). Best practice for taxonomic studies suggests an integrative approach combining morphological, molecular, ecological, and physiological data. Previous suggestions for applying divergence cut-off values of gene sequences to discriminate and define taxa (threshold-based approach), however, are based on overall dissimilarity and are not character-based, i.e., they do not use distinct molecular characters for separation and characterization. In the character-based approach, each position of an alignment represents a molecular character which may adopt different states in gene sequence data (e.g., nucleotides and deletions). Diagnostic molecular characters are included in taxon diagnoses (e.g., of protists or animals). However, data from (potentially) related taxa for comparison with the type species are often lacking or difficult to obtain. Furthermore, available data are frequently not added consistently to formal diagnoses, due to problems in, for instance, the definition of diagnostic molecular characters and the designation of their positions, as well as the lack of suitable tools. For a standardized designation of the position of diagnostic molecular characters in taxon diagnoses, a reference sequence alignment and/or a reference sequence are recommended, facilitating comparability and reproducibility.
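The character-based approach described in the DeSignate abstract can be illustrated with a minimal sketch. It assumes a simplified definition: an alignment column is diagnostic when the states seen in the query group do not overlap with those seen in the reference group. The function name `diagnostic_positions` is hypothetical, and DeSignate's own criteria may be more refined than this.

```python
def diagnostic_positions(query_seqs, reference_seqs):
    """Return 1-based alignment columns whose character states do not
    overlap between the query group and the reference group.

    Assumption (not taken from the DeSignate paper): a column counts as
    diagnostic only if the two sets of states are fully disjoint.
    """
    length = len(query_seqs[0])
    if any(len(seq) != length for seq in query_seqs + reference_seqs):
        raise ValueError("all aligned sequences must have the same length")

    positions = []
    for col in range(length):
        # Each column is one molecular character; gaps ('-') count as a state.
        query_states = {seq[col] for seq in query_seqs}
        reference_states = {seq[col] for seq in reference_seqs}
        if query_states.isdisjoint(reference_states):
            positions.append(col + 1)
    return positions


query = ["ACGT-", "ACGTA"]
reference = ["ATGTC", "ATGTC"]
print(diagnostic_positions(query, reference))
```

Reporting 1-based columns relative to a fixed reference alignment matches the abstract's recommendation to designate positions against a reference for reproducibility.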
  • GiniClust3: a fast and memory-efficient tool for rare cell type identification
    Abstract: Analysis of a mouse brain dataset with more than one million cells. An overview of the GiniClust3 pipeline. The input single-cell expression matrix is clustered based on features selected by Gini index (GiniIndexClust) and by Fano factor (FanoFactorClust), respectively. The results are then integrated using a cluster-aware, weighted consensus clustering algorithm (ConsensusClust). UMAP visualization of the gene expression patterns based on Fano-factor (top) and Gini index (bottom) selected features, respectively. Consensus clustering results are indicated by different colors. The proportion of rare cell clusters in the entire population. Heatmap of cell type mapping of common and rare clusters from scMCA analysis. The bar plot at the top indicates the cell number for each cluster
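The two feature-selection statistics named in the GiniClust3 abstract can be sketched in plain Python. This is only an illustration of the textbook Gini index and Fano factor for one gene's expression values across cells; the function names are hypothetical, and the normalization and gene-selection thresholds used by GiniClust3 itself are not given in this listing.

```python
def gini_index(values):
    """Gini index of non-negative values.

    0 means the expression is spread evenly across cells; values close
    to 1 mean it is concentrated in a few cells.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on sorted data: G = sum_i (2i - n - 1) * x_i / (n * sum x)
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


def fano_factor(values):
    """Variance-to-mean ratio, using the population variance."""
    n = len(values)
    mean = sum(values) / n
    if mean == 0:
        return 0.0
    variance = sum((x - mean) ** 2 for x in values) / n
    return variance / mean


rare_gene = [0, 0, 0, 10]    # expressed in one of four cells
common_gene = [1, 1, 1, 1]   # expressed evenly
print(gini_index(rare_gene), fano_factor(rare_gene))
print(gini_index(common_gene), fano_factor(common_gene))
```

A gene expressed in only a few cells scores high on the Gini index, which is why that statistic suits rare cell types.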
  • SpatialCPie: an R/Bioconductor package for spatial transcriptomics cluster evaluation
    Abstract:
  • circRNAprofiler: an R-based computational framework for the downstream analysis of circular RNAs
    Abstract: Schematic representation of the circRNA analysis workflow implemented by circRNAprofiler. The grey boxes represent the 15 modules described in the text with the main R functions reported in italics. The different types of sequences that can be selected are depicted in the dashed box. BSJ, Back-Spliced Junction
  • YSMR: a video tracking and analysis program for bacterial motility
    Abstract: From original frame to final detection. A: the original frame from the video file. B: the frame is converted to grey scale. C: a Gaussian blur with a 3 by 3 kernel is applied. D: the adaptive threshold is applied, leaving white areas as potential bacteria. E: a second, higher, adaptive threshold is applied to generate markers. F: white areas from D which contain markers from E are used as outlines. G: each area is encased in a rectangle and assigned a unique ID, displayed on the original frame
  • vivaGen – a survival data set generator for software testing
    Abstract: Software testing is an essential part of the software development process. The testing process is usually divided into different consecutive test levels: function test (on function/method level), module test, integration test, and system test. The higher the test level, the greater the need for more complex test data. Medical and clinical systems in particular require a systematic and exceptionally intensive testing process before they can be considered to assist physicians in the diagnosis or treatment of patients. Although these systems could and should be tested using real-world data, this approach sometimes has disadvantages: real-world data can be influenced or distorted by effects such as confounders or mediators. These effects are non-random and in many cases unknown. As a consequence, they are difficult to control and must be taken into account when dealing with real-world data sets for testing purposes. In addition, real-world data might contain gaps, making some tests hard to perform.
  • wg-blimp: an end-to-end analysis pipeline for whole genome bisulfite sequencing data
    Abstract: Since the development of DNA sequencing, a large number of studies on genetic variation have been conducted, while extensive research on the epigenetic level has only emerged in the recent past. Although most cells within an organism are identical in their genomic sequence, different tissues and cell types vary in their patterns of epigenetic modifications that confer their particular identity. DNA methylation is one of the most important epigenetic marks and occurs mainly at CpG dinucleotides. There are almost 28 million such sites in the human genome, thus 450k arrays (which cover only 1.6% of all CpGs) are not sufficient to detect small differentially methylated regions (DMRs). As a result, data-intensive whole genome bisulfite sequencing (WGBS) is required to properly identify all CpG methylation levels. While the costs for generating these datasets have been very high, the continuous and sustained reduction of sequencing costs allows more and more WGBS datasets to be generated, creating the need for comprehensive and reproducible analysis tools. Many algorithms have already been established for different aspects of WGBS analyses such as alignment and DMR detection. However, choosing appropriate algorithms and integrating them into an end-to-end analysis workflow is not a trivial task due to the combinatorial explosion of possible pipeline setups. Setting up an end-to-end WGBS analysis workflow is further hindered by different requirements of interacting tools, e.g. input and output formats or chromosome naming conventions. Previously developed end-to-end pipelines already consider these problems and only require users to supply their raw data and configuration. However, we find that previous approaches lack features required in common research settings, e.g. methylome segmentation, and have technical limitations such as installation issues, as described in more detail in the “ ” section. As a result, we developed a pipeline approach to address these issues.
  • Negative binomial additive model for RNA-Seq data analysis
    Abstract: In recent years, RNA-Seq experiments have become the state-of-the-art method for quantifying mRNA levels by measuring gene expression digitally in biological samples. An RNA-Seq experiment usually starts with isolating RNA sequences from biological samples using the Illumina Genome Analyzer, a commonly used platform for high-throughput sequencing data. These mRNA sequences are reverse transcribed into cDNA fragments. To reduce the sequencing cost and increase the speed of reading the cDNA fragments (typically a few thousand bp), these fragments are sheared into short reads (50-450 bp). These reads are mapped back to the original reference genomes/transcriptomes and the number of reads mapping to each gene/transcript region is computed. RNA-Seq experiments are usually summarized as a count table with each row representing a gene/transcript and each column representing a sample.
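The count table described at the end of the RNA-Seq abstract (one row per gene, one column per sample) can be sketched as follows. The input format, a list of `(gene, sample)` pairs with one pair per mapped read, and the function name `build_count_table` are assumptions made for illustration. Real pipelines derive these assignments from alignment files.

```python
from collections import Counter


def build_count_table(assignments, genes, samples):
    """Summarize mapped reads as a gene-by-sample count table.

    assignments: iterable of (gene, sample) pairs, one per mapped read
                 (hypothetical input format).
    Returns a list of rows, one per gene, with one count per sample.
    """
    counts = Counter(assignments)
    # Counter returns 0 for (gene, sample) pairs that were never observed.
    return [[counts[(gene, sample)] for sample in samples] for gene in genes]


reads = [
    ("geneA", "s1"), ("geneA", "s1"), ("geneA", "s2"),
    ("geneB", "s2"), ("geneB", "s2"), ("geneB", "s2"),
]
table = build_count_table(reads, genes=["geneA", "geneB"], samples=["s1", "s2"])
for gene, row in zip(["geneA", "geneB"], table):
    print(gene, row)
```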
  • Blast2Fish: a reference-based annotation web tool for transcriptome analysis of non-model teleost fish
    Abstract: Workflow outline of Blast2Fish. This illustration summarizes the workflow of Blast2Fish. The dotted frame denotes the Blast2Fish system. The pipeline starts from the gene sequence file. Blast2Fish outputs MeSH terms, taxonomic distribution, and reference resources
  • MEPHAS: an interactive graphical user interface for medical and pharmaceutical statistical analysis with R and Shiny
    Abstract: The layout of MEPHAS in the interface of
  • TAMA: improved metagenomic sequence classification through meta-analysis
    Abstract: Overview of TAMA. In the read preprocessing step, low-quality input metagenome reads (single- or paired-end) are eliminated. An integrated database with an identical set of reference genomes is also created. Initial taxonomy classification results, which assign a taxon ID to each read sequence, are generated by using CLARK, Kraken, and Centrifuge with the integrated database. In the meta-analysis process, results from the three tools are calibrated and integrated to produce a read classification and a relative species abundance profile. The relative species abundance profile is generated only when the target taxonomic rank is species
  • ICEKAT: an interactive online tool for calculating initial rates from continuous enzyme kinetic traces
    Abstract: Continuous enzyme kinetic assays allow rapid acquisition of large numbers of kinetic traces. Therefore, data analysis often becomes the bottleneck of high-throughput enzyme kinetic assays. In cases where IC50/EC50 values or the Michaelis-Menten parameters kcat (or Vmax) and KM are of principal interest, reduction of kinetic traces to initial rates avoids error arising from assumptions involved in analyzing the entire kinetic trace. The two primary methods for determining initial rates from kinetic traces are estimation of the early linear portion of the curve and methods using integrated forms of kinetic equations. Currently available programs such as FITSIM, DYNAFIT, ENZO, PCAT, and KinTek offer sophisticated routines for fitting kinetic traces. These programs are useful for selecting among complex enzymatic models and analyzing experiments carried out under conditions that may not satisfy the assumptions associated with Michaelis-Menten kinetics, for example measuring catalysis inside cells. However, the additional complexity offered by these programs is often not required when analyzing in vitro experiments, making them inefficient and unnecessarily complicated for many continuous enzyme kinetic applications.
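The first of the two methods named in the ICEKAT abstract, estimating the early linear portion of a kinetic trace, can be sketched as an ordinary least-squares slope over the first few points. The function name `initial_rate` and the fixed-size window `n_points` are assumptions for illustration. How ICEKAT itself selects the linear range is not described in this listing.

```python
def initial_rate(times, signal, n_points):
    """Estimate the initial rate as the least-squares slope of the
    first n_points of a kinetic trace.

    Assumption: the caller has already decided that the first n_points
    lie within the linear phase of the reaction.
    """
    if n_points < 2:
        raise ValueError("need at least two points to fit a slope")
    t = times[:n_points]
    y = signal[:n_points]
    t_mean = sum(t) / len(t)
    y_mean = sum(y) / len(y)
    # Slope of the least-squares line: cov(t, y) / var(t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    return sxy / sxx


# A trace that is linear at first and then plateaus as substrate is consumed.
times = [0, 1, 2, 3, 4]
signal = [0.0, 2.0, 4.0, 5.0, 5.5]
print(initial_rate(times, signal, n_points=3))
```

Fitting only the early points avoids committing to a model for the whole trace, which is the advantage the abstract attributes to initial-rate analysis.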
  • Broad-coverage biomedical relation extraction with SemRep
    Abstract: A massive amount of biomedical knowledge is buried in free text, including scientific publications and clinical narratives. Natural language processing (NLP) techniques are increasingly used to extract from free text biomedical concepts, such as disorders, medications, tests, and genes/proteins, as well as relationships between them, including disease treatments, protein/drug interactions, and adverse drug events. Such techniques transform unstructured text into computable semantic representations, which can in turn support biomedical knowledge management and discovery applications, allowing clinicians and bench scientists to more efficiently access information and generate new knowledge.
  • CIPR: a web-based R/shiny app and R package to annotate cell clusters in single cell RNA sequencing experiments
    Abstract: Summary of the reference datasets included in CIPR
  • fcScan: a versatile tool to cluster combinations of sites using genomic coordinates
    Abstract: fcScan search strategy and performance. Left: Example of data input, which can be a GRanges object, a data frame, or BED/VCF files. Right: schematic representation of a search example for clusters containing a combination of 2 heterotypic genomic features (orange circles and green squares) and excluding a third genomic feature (yellow triangle) in a window size of 200 bp. Sites within identified clusters must obey the order and orientation defined. Bottom: Function call on the data represented in the upper left corner with its corresponding output. Only one cluster, marked by the “correct” sign, is called based on the criteria above. All remaining clusters are eliminated for the reason described above each case
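The window search outlined in the fcScan caption can be approximated with a short sketch: find windows of at most 200 bp that contain the required features and none of the excluded ones. This is a simplified stand-in and does not use the fcScan API. The function name `find_clusters` and the input format are hypothetical, and the order and orientation constraints mentioned in the caption are deliberately left out.

```python
from collections import Counter


def find_clusters(sites, required, excluded, window):
    """Return (start, end) spans that satisfy the feature combination.

    sites:    list of (position, feature) tuples (hypothetical format)
    required: dict mapping feature -> minimum count inside the window
    excluded: set of features that must not occur inside the window
    window:   maximum span in bp, measured from the first site
    """
    ordered = sorted(sites)
    hits = []
    for i, (start, _) in enumerate(ordered):
        # All sites from this one onward that fall within the window.
        in_window = [(p, f) for p, f in ordered[i:] if p - start <= window]
        counts = Counter(f for _, f in in_window)
        if any(counts[f] > 0 for f in excluded):
            continue
        if all(counts[f] >= k for f, k in required.items()):
            hits.append((start, in_window[-1][0]))
    return hits


sites = [(100, "A"), (150, "B"), (400, "A"), (450, "C"), (480, "B"), (900, "A")]
print(find_clusters(sites, required={"A": 1, "B": 1}, excluded={"C"}, window=200))
```

In this toy input the second A/B pair is rejected because an excluded feature lies between the two sites, mirroring the elimination cases shown in the figure.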
  • PACVr: plastome assembly coverage visualization in R
    Abstract: The sequencing and comparison of complete plastid genomes has become a popular method in plant evolutionary research, rendering the precise genome assembly and its quality assessment of high importance. The plastid genomes of most photosynthetically active land plants display a circular, quadripartite structure and comprise two single copy (SC) regions separated by two identical inverted repeats (IR). A total of four partitions with markedly different lengths can, thus, be defined in typical land plant plastomes: the large single copy (LSC) region of ca. 70-90 kilobases (kb), the small single copy (SSC) region of ca. 15-25 kb, and the two IR regions (IRa and IRb) of ca. 20-25 kb each. The IR regions represent reverse complements of each other and are primarily homogenized through a recombination-mediated replication process. The plastid genomes of most photosynthetically active land plants encode a total of ca. 100-120 proteins, which play a central role in organelle metabolism and photosynthesis. Due to their strong structural conservation, uniparental inheritance, a near absence of recombination, and a high copy number per plant cell, plastid genomes are highly suitable for comparative genomic studies. Numerous investigations have sequenced and compared complete plastid genome sequences over the past decade, and the number of publicly available plastid genomes continues to increase dramatically. Recent studies on plastid genome structure and evolution have evaluated polymorphisms across hundreds or even thousands of plastid genome sequences, rendering the precise assembly process of plastid genomes and their quality assessment ever more important.
  • Substitution matrix based color schemes for sequence alignment visualization
    Abstract: Typically, visualization of multiple protein sequence alignments colors the amino acid symbols according to some kind of (chemical) property. Examples of software using this visualization technique are [ ], [ ] or [ ]. Figure  shows an alignment using the default color scheme depicting the chemical characteristics of the amino acids. Typically, such a color scheme is created manually by a professional using their intuition and knowledge about any characteristics to be emphasized.
  • VADR: validation and annotation of virus sequence submissions to GenBank
    Abstract: As of September 2019, GenBank contained more than 3 million viral sequences totaling over 4 billion nucleotides in length and including over 180,000 complete genomes for viruses other than influenza. More than 250,000 of these sequences were submitted in 2018. All sequence submissions are validated prior to deposition in GenBank. Automated validation and annotation methods become increasingly important as sequence submission numbers grow.
  • ARTDeco: automatic readthrough transcription detection
    Abstract: ARTDeco evaluates different aspects of readthrough transcription. Schematic diagram of typical transcription termination (top) and readthrough transcription (bottom). Total RNA-seq, RNA polymerase II ChIP-seq, and H3K27ac ChIP-seq data at locus. Normalized read coverage ranges are indicated on the right and signals exceeding these levels may be clipped (e.g. RNA-seq coverage on the exons of ). represents a primary induction gene while , , and represent read-in genes. Schematic depicting the regions used to quantify read-in levels, readthrough levels, and DoG transcript discovery for each gene
