Journal: Signal Processing Letters, IEEE

FASTSUBS: An Efficient and Exact Procedure for Finding the Most Likely Lexical Substitutes Based on an N-Gram Language Model


Abstract

Lexical substitutes have found use in areas such as paraphrasing, text simplification, machine translation, word sense disambiguation, and part-of-speech induction. However, the computational complexity of accurately identifying the most likely substitutes for a word has made large-scale experiments difficult. In this letter, we introduce a new search algorithm, fastsubs, that is guaranteed to find the $K$ most likely lexical substitutes for a given word in a sentence based on an n-gram language model. The computation is sublinear in both $K$ and the vocabulary size $V$. An implementation of the algorithm and a dataset with the top 100 substitutes of each token in the WSJ section of the Penn Treebank are available at http://goo.gl/jzKH0.
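To make the task concrete, the sketch below shows the naive baseline that the abstract contrasts against: scoring every vocabulary word in the target slot and keeping the best $K$, which costs time linear in $V$. It is not the fastsubs algorithm, whose search procedure is not described in the abstract. The toy corpus, the bigram order, the add-one smoothing, and all function names (`train_bigram`, `bigram_logprob`, `top_k_substitutes`) are assumptions made purely for illustration.

```python
import heapq
import math
from collections import Counter

# Toy corpus (an assumption for illustration, not data from the paper).
CORPUS = [
    ["the", "cat", "sat"],
    ["the", "dog", "sat"],
    ["the", "cat", "ran"],
    ["a", "dog", "ran"],
]

def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_logprob(prev, word, unigrams, bigrams):
    """Add-one smoothed log P(word | prev); the smoothing is an assumption."""
    vocab_size = len(unigrams)
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))

def top_k_substitutes(sentence, position, k, unigrams, bigrams):
    """Brute-force scan: score every vocabulary word in the target slot.

    Only the two bigrams that contain the slot change with the candidate,
    so the rest of the sentence can be ignored when ranking.
    """
    tokens = ["<s>"] + sentence + ["</s>"]
    i = position + 1  # shift by one for the <s> marker
    left, right = tokens[i - 1], tokens[i + 1]
    candidates = [w for w in unigrams if w not in ("<s>", "</s>")]
    scored = [
        (bigram_logprob(left, w, unigrams, bigrams)
         + bigram_logprob(w, right, unigrams, bigrams), w)
        for w in candidates  # one pass over the whole vocabulary: O(V)
    ]
    return heapq.nlargest(k, scored)

unigrams, bigrams = train_bigram(CORPUS)
# Rank substitutes for "cat" in "the cat sat".
best = top_k_substitutes(["the", "cat", "sat"], 1, 2, unigrams, bigrams)
```

Each query here visits all $V$ candidates, which is what makes corpus-scale experiments expensive and what a procedure sublinear in $V$ avoids.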
