Conference: Working Conference on Mining Software Repositories

Automatically mining software-based, semantically-similar words from comment-code mappings


Abstract

Many software development and maintenance tools match natural language words across different software artifacts (e.g., traceability) or match user-submitted queries against software artifacts (e.g., code search). Because the queries and the various artifacts were likely written by different people, these tools often become more effective when queries are expanded and related words are added to the textual representations of artifacts. Synonyms, along with other word relations that indicate semantic similarity, are particularly useful for overcoming vocabulary mismatch. However, experience shows that many words are semantically similar in computer science contexts but not in typical natural language documents. In this paper, we present an automatic technique for mining semantically similar words specifically in the software context. We leverage the role of leading comments for methods and the conventions programmers follow in writing them. Our evaluation shows that a high proportion of the mined comment-code word mappings that do not already occur in WordNet are indeed viewed as semantically similar word pairs in computer science.
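The abstract does not spell out the mining procedure, so the following is only a minimal sketch of the general idea of pairing a word from a method's leading comment with a word from the method's name. It assumes, purely for illustration, that the first word of the comment and the first word of the identifier both name the method's action; the helpers `split_identifier`, `first_word`, and `mine_pairs`, as well as the sample data, are hypothetical and not taken from the paper. Filtering out pairs already present in WordNet, which the abstract mentions, is omitted here.

```python
import re
from collections import Counter


def split_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def first_word(text):
    """Return the first alphabetic word of a comment, lowercased, or None."""
    words = re.findall(r"[A-Za-z]+", text)
    return words[0].lower() if words else None


def mine_pairs(methods):
    """Count (comment word, name word) pairs that differ.

    `methods` is an iterable of (leading_comment, method_name) tuples.
    Assumption: the leading word on each side describes the same action.
    """
    counts = Counter()
    for comment, name in methods:
        comment_word = first_word(comment)
        name_words = split_identifier(name)
        if comment_word is None or not name_words:
            continue
        name_word = name_words[0]
        if comment_word != name_word:
            counts[(comment_word, name_word)] += 1
    return counts


# Hypothetical sample data for illustration only.
SAMPLE = [
    ("Delete the given entry from the cache.", "removeEntry"),
    ("Delete all temporary files.", "remove_temp_files"),
    ("Create a new session.", "makeSession"),
    ("Remove the listener.", "removeListener"),
]

if __name__ == "__main__":
    for pair, count in mine_pairs(SAMPLE).most_common():
        print(pair, count)
```

Pairs that recur across many methods would be the more plausible candidates for software-specific similarity. A realistic version would also need part-of-speech analysis and word-form normalization, which this sketch leaves out.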
