Deep sequencing of HBV pre-S region reveals high heterogeneity of HBV genotypes and associations of word pattern frequencies with HCC

Abstract

Hepatitis B virus (HBV) infection is a common health problem worldwide, especially in China. In regions of high HBV prevalence, an estimated 60–80% of hepatocellular carcinoma (HCC) cases can be attributed to HBV infection. Although traditional Sanger sequencing has been used extensively to investigate HBV sequences, next-generation sequencing (NGS) is becoming more common. However, it is unknown whether word pattern frequencies of HBV reads obtained by NGS can be used to investigate HBV genotypes and predict HCC status. In this study, we used NGS to sequence the pre-S region of HBV from 94 HCC patients and 45 individuals with chronic HBV (CHB) infection. Word pattern frequencies were calculated from the sequence data of each individual and compared between individuals using the Manhattan distance. The individuals were grouped using principal coordinate analysis (PCoA) and hierarchical clustering. Word pattern frequencies were also used to build prediction models for HCC status using both K-nearest neighbors (KNN) and support vector machines (SVM). The results show that word patterns are a powerful tool for analyzing HBV sequences. One key finding is that the first principal coordinate of the PCoA was highly associated with the fraction of genotype B (or C) sequences, while the second principal coordinate was significantly associated with the probability of having HCC. Hierarchical clustering grouped the individuals first by major genotype and then by HCC status. Under cross-validation, the area under the receiver operating characteristic curve (AUC) was around 0.88 for KNN and 0.92 for SVM. In an independent data set of 46 HCC patients and 31 CHB individuals, SVM achieved an AUC of 0.77. We further showed that 3000 reads per individual are enough to yield stable SVM predictions. A second key finding is therefore that word patterns can be used to predict HCC status with high accuracy.
In summary, our study clearly shows that word pattern frequencies of HBV sequences carry substantial information about both the composition of HBV genotypes and the HCC status of an individual.
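
The core of the approach described in the abstract (word pattern frequencies compared by Manhattan distance, then a nearest-neighbor prediction) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the word length `k=3`, and the toy reads and group labels are all assumptions made for illustration, and the abstract does not state which word lengths the study used. The SVM, PCoA, and hierarchical clustering steps are not shown.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def kmer_frequencies(reads, k=3):
    """Relative frequency of every DNA word of length k across a set of reads.

    Words containing characters other than A, C, G, T are skipped.
    The word length k=3 is an illustrative assumption.
    """
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            word = read[i:i + k]
            if all(base in ALPHABET for base in word):
                counts[word] += 1
    total = sum(counts.values())
    words = ["".join(p) for p in product(ALPHABET, repeat=k)]
    if total == 0:
        return {w: 0.0 for w in words}
    return {w: counts[w] / total for w in words}


def manhattan_distance(freq_a, freq_b):
    """Sum of absolute differences between two frequency vectors."""
    return sum(abs(freq_a[w] - freq_b[w]) for w in freq_a)


def knn_predict(query, labelled, n_neighbors=3):
    """Majority vote over the n_neighbors labelled vectors closest to query."""
    ranked = sorted(labelled, key=lambda item: manhattan_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:n_neighbors])
    return votes.most_common(1)[0][0]


# Toy reads and labels, invented for illustration only (not data from the study).
samples = [
    (kmer_frequencies(["ACGTACGTAC", "CGTACGTACG"]), "group_1"),
    (kmer_frequencies(["ACGTACGAAC", "CGTACGTACC"]), "group_1"),
    (kmer_frequencies(["TTTTGGGGTT", "GGTTTTGGGG"]), "group_2"),
]
query = kmer_frequencies(["ACGTACGTAA"])
print(knn_predict(query, samples, n_neighbors=1))
```

Normalizing counts to frequencies keeps individuals with different read depths comparable, which is one plausible reason a frequency-based representation suits per-individual NGS data.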

Bibliographic details

Journal: PLoS Genetics
Authors: Xin Bai; Jian-an Jia; Meng Fang; Shipeng Chen; Xiaotao Liang; Shanfeng Zhu; Shuqin Zhang; Jianfeng Feng; Fengzhu Sun; Chunfang Gao
Year (volume), issue: 2018 (14), 2
Article number: e1007206
Total pages: 20
Original format: PDF
Chinese Library Classification: Genetics
