Building a biomedical tokenizer using the token lattice design pattern and the adapted Viterbi algorithm

Abstract

Background: Tokenization is an important component of language processing, yet there is no widely accepted tokenization method for English texts, including biomedical texts. Apart from rule-based techniques, tokenization in the biomedical domain has been treated as a classification task. Classifier-based biomedical tokenizers form tokens by either splitting or joining textual objects through classification. The idiosyncratic output of each biomedical tokenizer complicates adoption and reuse. Furthermore, biomedical tokenizers generally lack guidance on how to apply an existing tokenizer to a new domain or subdomain. We identify and complete a novel tokenizer design pattern and suggest a systematic approach to tokenizer creation. We implement a tokenizer based on our design pattern that combines regular expressions and machine learning. Our machine learning approach differs from the previous split-join classification approaches. We evaluate our approach against three other tokenizers on the task of tokenizing biomedical text.
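As a rough illustration of how regular-expression candidates and a scoring step might be combined, the sketch below builds a lattice of candidate tokens and picks the highest-scoring path with a Viterbi-style dynamic program. The abstract does not give the authors' patterns, scores, or the details of their adapted Viterbi algorithm, so `PATTERNS`, the hand-set weights (standing in for a learned model), `build_lattice`, and `tokenize` are all hypothetical.

```python
import re

# Hypothetical candidate patterns with hand-set per-character weights.
# In a real system the weights would come from a trained model.
PATTERNS = [
    (re.compile(r"\d+(?:\.\d+)?\s?(?:mg|ml|kg)"), 3.0),  # dosage, e.g. "2.5 mg"
    (re.compile(r"[A-Za-z]+(?:-[A-Za-z0-9]+)+"), 2.5),   # hyphenated term, e.g. "IL-2"
    (re.compile(r"[A-Za-z]+|\d+"), 1.0),                 # plain word or number
    (re.compile(r"\S"), 0.1),                            # single-character fallback
]


def build_lattice(text):
    """Map each start offset to outgoing edges (end, score, is_token)."""
    edges = {}
    for start in range(len(text)):
        out = []
        if text[start].isspace():
            # Whitespace is skipped at no cost and never emitted as a token.
            out.append((start + 1, 0.0, False))
        else:
            for pattern, weight in PATTERNS:
                m = pattern.match(text, start)
                if m and m.end() > start:
                    # Score grows with span length, so longer matches are favoured.
                    out.append((m.end(), weight * (m.end() - start), True))
        edges[start] = out
    return edges


def tokenize(text):
    """Return the tokens along the highest-scoring path through the lattice."""
    n = len(text)
    edges = build_lattice(text)
    best = [float("-inf")] * (n + 1)  # best score of any path reaching each offset
    back = [None] * (n + 1)           # (previous offset, is_token) on that path
    best[0] = 0.0
    for start in range(n):
        if best[start] == float("-inf"):
            continue
        for end, score, is_token in edges[start]:
            if best[start] + score > best[end]:
                best[end] = best[start] + score
                back[end] = (start, is_token)
    # Follow back-pointers from the end of the text to recover the tokens.
    tokens = []
    pos = n
    while pos > 0:
        start, is_token = back[pos]
        if is_token:
            tokens.append(text[start:pos])
        pos = start
    return tokens[::-1]


print(tokenize("Give IL-2 2.5 mg daily."))
# ['Give', 'IL-2', '2.5 mg', 'daily', '.']
```

Under these assumed weights the lattice also contains competing candidates such as "IL", "-", "2" and "5 mg"; the best-path search is what selects one consistent segmentation among them.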


Bibliographic details

Journal: BMC Bioinformatics
Authors: Neil Barrett; Jens Weber-Jahnke
Year (volume), issue: 2011 (12), Suppl 3
Pages: S1
Total pages: 11
Original format: PDF
CLC classification: Applied microbiology; Biochemical genetics; Biochemical pharmacology

Similar literature

1. Building a biomedical tokenizer using the token lattice design pattern and the adapted Viterbi algorithm [J]. Neil Barrett, Jens Weber-Jahnke. BMC Bioinformatics. 2011, Suppl 3
2. Token Tenure and PATCH: A Predictive/Adaptive Token-Counting Hybrid [J]. Arun Raghavan, Colin Blundell, Milo M. K. Martin. ACM Transactions on Architecture and Code Optimization. 2010, No. 2
3. Token Bucket Fair Scheduling Algorithm with Adaptive Rate Allocations for Heterogeneous Wireless Networks [J]. AlQahtani Salman A. Wireless Personal Communications: An International Journal. 2015, No. 2
4. Building a Biomedical Tokenizer Using the Token Lattice Design Pattern and the Adapted Viterbi Algorithm [C]. Neil Barrett, Jens Weber-Jahnke. Ninth International Conference on Machine Learning and Applications. 2010
5. Adaptive Token Bank Fair Queuing scheduling in the downlink of 4G wireless networks [D]. Bokhari, Feroz A. 2008
6. Algorithmic shaping and misbehavior in the acquisition of token deposit by rats [O]. Marie Midgley, Stephen E. G. Lea, Rachel M. Kirby. 1989
7. Building a biomedical tokenizer using the token lattice design pattern and the adapted Viterbi algorithm [O]. Neil Barrett, Jens Weber-Jahnke. 2011
8. Token Execution Strategies for Distributed Algorithms: Simulation Studies [R]. Lloyd, M. J. 1987
