Journal: Empirical Software Engineering

APIReal: an API recognition and linking approach for online developer forums


Abstract

When discussing programming issues on social platforms (e.g., Stack Overflow, Twitter), developers often mention APIs in natural language text. Extracting these API mentions is a prerequisite for effectively indexing and searching API-related information in software engineering social content. The task involves two steps: 1) distinguishing API mentions from other English words (i.e., API recognition), and 2) disambiguating a recognized API mention to its unique fully qualified name (i.e., API linking). Software engineering social content lacks consistent API mentions and sentence writing formats. As a result, API recognition and linking have to deal with the inherent ambiguity of API mentions in informal text. For example, a common word can have both an API sense and a normal sense (e.g., append, apply, and merge); the simple name of an API can map to several APIs of the same library or of different libraries; and different written forms of an API should be linked to the same API. In this paper, we propose a semi-supervised machine learning approach that exploits name synonyms and the rich semantic context of API mentions for API recognition in informal text. Based on the results of our API recognition approach, we further propose an API linking approach that leverages a set of domain-specific heuristics, including mention-mention similarity, scope filtering, and mention-entry similarity, to determine which API in the knowledge base a recognized API mention actually refers to. To evaluate our API recognition approach, we use 1205 API mentions of three libraries (Pandas, Numpy, and Matplotlib) from Stack Overflow text. We also evaluate our API linking approach with 120 recognized API mentions of these three libraries.
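
To make the linking step more concrete, the following is a minimal sketch of how a recognized mention might be mapped to a fully qualified name. It is not the approach proposed in the paper: the knowledge base is a tiny hand-picked subset, the alias table and all function names (`normalize`, `candidates`, `scope_filter`, `link`) are hypothetical, scope filtering is reduced to checking whether a library name or import alias appears in the surrounding text, and mention-entry similarity is approximated with a plain string ratio from the standard library. Mention-mention similarity and the recognition step itself are not modeled.

```python
import re
from difflib import SequenceMatcher

# Toy knowledge base of fully qualified API names (illustrative subset only).
KNOWLEDGE_BASE = [
    "pandas.merge",
    "pandas.DataFrame.merge",
    "pandas.DataFrame.apply",
    "pandas.Series.apply",
    "numpy.append",
    "numpy.apply_along_axis",
    "matplotlib.pyplot.plot",
    "pandas.DataFrame.plot",
]

# Hypothetical alias table: common import aliases mapped to library names.
ALIASES = {"pd": "pandas", "np": "numpy", "plt": "matplotlib"}


def normalize(mention):
    """Lowercase a mention and drop trailing call parentheses and arguments."""
    return re.sub(r"\(.*\)$", "", mention.strip()).lower()


def simple_name(name):
    """Return the last dotted component, e.g. 'merge' for 'pandas.merge'."""
    return name.split(".")[-1]


def candidates(mention, kb=KNOWLEDGE_BASE):
    """All knowledge-base entries whose simple name matches the mention's."""
    target = simple_name(normalize(mention))
    return [entry for entry in kb if simple_name(entry).lower() == target]


def scope_filter(cands, context):
    """Keep candidates whose library is named (or aliased) in the context.

    Falls back to the unfiltered list when the context names no matching library.
    """
    tokens = set(re.findall(r"[a-z_]+", context.lower()))
    libraries = {ALIASES.get(token, token) for token in tokens}
    kept = [entry for entry in cands if entry.split(".")[0] in libraries]
    return kept or cands


def link(mention, context, kb=KNOWLEDGE_BASE):
    """Pick the candidate most similar to the mention, or None if there is none."""
    cands = scope_filter(candidates(mention, kb), context)
    if not cands:
        return None
    norm = normalize(mention)
    return max(
        cands,
        key=lambda entry: SequenceMatcher(None, norm, entry.lower()).ratio(),
    )
```

For instance, `candidates("merge")` returns both `pandas.merge` and `pandas.DataFrame.merge`, which illustrates how one simple name can map to several APIs of the same library. A more qualified mention such as `DataFrame.merge` lets the similarity step prefer `pandas.DataFrame.merge`.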
