Enhanced protein domain discovery by using language modeling techniques from speech recognition.

Coin L; Bateman A; Durbin R

Abstract

Most modern speech recognition uses probabilistic models to interpret a sequence of sounds. Hidden Markov models, in particular, are used to recognize words. The same techniques have been adapted to find domains in protein sequences of amino acids. To increase word accuracy in speech recognition, language models are used to capture the information that certain word combinations are more likely than others, thus improving detection based on context. However, to date, these context techniques have not been applied to protein domain discovery. Here we show that the application of statistical language modeling methods can significantly enhance domain recognition in protein sequences. As an example, we discover an unannotated Tf_Otx Pfam domain on the cone rod homeobox protein, which suggests a possible mechanism for how the V242M mutation on this protein causes cone-rod dystrophy.
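The abstract does not describe the authors' actual model, so the following is only a minimal sketch of the general idea, assuming a bigram "language model" over domain names layered on top of per-domain HMM scores. The function names (`context_bonus`, `rescore`), the score convention (bits above or below a detection threshold), and all probabilities are hypothetical and chosen purely for illustration; they are not taken from the paper or from Pfam.

```python
import math

def context_bonus(prev, cur, bigram, unigram, floor=1e-4):
    """Log-odds (in bits) that `cur` follows `prev`, relative to its background rate.

    `bigram` maps (prev, cur) -> P(cur | prev); `unigram` maps cur -> P(cur).
    Unseen entries fall back to `floor` (an assumed smoothing value).
    """
    p_ctx = bigram.get((prev, cur), floor)
    p_bg = unigram.get(cur, floor)
    return math.log2(p_ctx / p_bg)

def rescore(hits, bigram, unigram):
    """Pick the best-scoring subset of candidate domain hits.

    `hits` is a list of (domain_name, score_bits) ordered along the protein,
    where score_bits is the HMM score relative to its threshold (assumed).
    Dynamic programming over "last accepted domain": each hit is either
    skipped or accepted with its own score plus a context bonus.
    """
    best = {"<s>": (0.0, [])}  # "<s>" marks the start of the protein
    for name, score in hits:
        new_best = dict(best)  # skipping the hit keeps every existing state
        for prev, (total, path) in best.items():
            cand = total + score + context_bonus(prev, name, bigram, unigram)
            if name not in new_best or cand > new_best[name][0]:
                new_best[name] = (cand, path + [name])
        best = new_best
    return max(best.values(), key=lambda t: t[0])

# Made-up numbers: a strong Homeobox hit followed by a weak Tf_Otx hit
# that falls 2 bits below its threshold when scored on its own.
hits = [("Homeobox", 30.0), ("Tf_Otx", -2.0)]
unigram = {"Homeobox": 0.05, "Tf_Otx": 0.001}
bigram = {("<s>", "Homeobox"): 0.05, ("Homeobox", "Tf_Otx"): 0.2}

total_with_context, path_with_context = rescore(hits, bigram, unigram)
total_no_context, path_no_context = rescore(hits, {}, {})
print(path_with_context)  # both domains are kept when context is used
print(path_no_context)    # only the strong hit survives without context
```

With empty probability tables every context bonus is zero, so the weak hit is dropped; with the illustrative bigram table the preceding domain lifts it above the cut. This mirrors how a language model in speech recognition can rescue an acoustically ambiguous word when the surrounding words make it likely.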

Bibliographic details

Source: Proceedings of the National Academy of Sciences of the United States of America, 2003, issue 8, pp. 4516-4520 (5 pages)
Authors: Coin L; Bateman A; Durbin R
Affiliation: Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SA, United Kingdom
Indexed in: Science Citation Index (SCI); MEDLINE; Chemical Abstracts (CA)
Original format: PDF
Language: English
Chinese Library Classification: General natural sciences
Keywords: Protein Structure, Tertiary
Date added to database: 2022-08-18 00:46:29

