Journal: International Journal of Semantic Computing

COMPARING TWO CORPUS-BASED METHODS FOR EXTRACTING PARAPHRASES TO DICTIONARY-BASED METHOD

Abstract

Paraphrase extraction plays an increasingly important role in language-related research and applications in areas such as information retrieval, question answering and automatic machine evaluation. Most existing methods extract paraphrases from different types of corpora using syntactic-based approaches. Because a syntactic-based approach relies on similarity of context to identify and capture paraphrases, it also tends to extract non-paraphrases that appear in similar contexts, such as loosely related terms and functionally similar yet unrelated terms. In addition, different types of corpora suffer from different problems, such as limited availability and domain bias. This paper presents a solely semantic-based paraphrase extraction model. The model collects paraphrases from multiple lexical resources and validates them semantically in three ways: by computing domain similarity, definition similarity and word similarity. The model is benchmarked against two outstanding syntactic-based approaches. The experimental results from a manual evaluation show that the proposed model outperforms the benchmarks. The results indicate that a semantic-based approach should be applied in paraphrase extraction instead of a syntactic-based approach. The results further suggest that a hybrid of the two approaches should be applied if one targets strictly precise paraphrases.
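The three-way semantic validation described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the similarities are computed or combined, so everything below is an assumption made for illustration: the `Entry` structure, the toy entries, the use of Jaccard overlap for definition and word similarity, the exact-match domain check, and the rule that all three scores must reach a threshold of 0.3. It is not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """A toy lexical-resource entry (hypothetical structure, not from the paper)."""
    word: str
    domain: str
    definition: str
    synonyms: frozenset


def jaccard(a, b):
    """Jaccard overlap of two collections; 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def domain_similarity(x, y):
    # Assumed measure: 1.0 if both entries carry the same domain label, else 0.0.
    return 1.0 if x.domain == y.domain else 0.0


def definition_similarity(x, y):
    # Assumed measure: token overlap between the two dictionary definitions.
    return jaccard(x.definition.lower().split(), y.definition.lower().split())


def word_similarity(x, y):
    # Assumed measure: overlap of synonym sets, each including its own headword.
    return jaccard(x.synonyms | {x.word}, y.synonyms | {y.word})


def is_paraphrase(x, y, threshold=0.3):
    """Accept a candidate pair only if all three checks reach the threshold
    (an assumed combination rule)."""
    scores = (
        domain_similarity(x, y),
        definition_similarity(x, y),
        word_similarity(x, y),
    )
    return all(score >= threshold for score in scores)


# Hypothetical toy entries, for illustration only.
buy = Entry("buy", "commerce", "obtain something by paying money",
            frozenset({"purchase", "acquire"}))
purchase = Entry("purchase", "commerce", "obtain something by paying money for it",
                 frozenset({"buy", "acquire"}))
sell = Entry("sell", "commerce", "give something in exchange for money",
             frozenset({"vend", "trade"}))

print(is_paraphrase(buy, purchase))
print(is_paraphrase(buy, sell))
```

In this toy setting, "buy" and "sell" share a domain and would plausibly occur in similar contexts, yet the pair fails the definition and word checks. This is the kind of functionally similar but unrelated pair that the abstract says a context-based approach tends to extract.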
