Journal: IEEE Transactions on Software Engineering

Mining Likely Analogical APIs Across Third-Party Libraries via Large-Scale Unsupervised API Semantics Embedding


Abstract

Establishing API mappings between third-party libraries is a prerequisite for library migration tasks. Doing so manually is tedious because of the large number of APIs that must be examined, so an automatic technique for building a database of likely API mappings can significantly ease the task. Unfortunately, existing techniques fall short in one of two ways. Some adopt a supervised learning mechanism that requires already-ported or functionally similar applications across major programming languages or platforms, which are difficult to come by for an arbitrary pair of third-party libraries. Others cannot deal with the lexical gap between the API descriptions of different libraries. To overcome these limitations, we present an unsupervised, deep-learning-based approach that embeds both API usage semantics and API description (name and documentation) semantics into a vector space for inferring likely analogical API mappings between libraries. Based on deep learning models trained on tens of millions of API call sequences, together with the method names and comments of 2.8 million methods from 135,127 GitHub projects, our approach significantly outperforms other deep learning or traditional information retrieval (IR) methods for inferring likely analogical APIs. We implement a proof-of-concept website (https://similarapi.appspot.com) which can recommend analogical APIs for 583,501 APIs of 111 pairs of analogical Java libraries with diverse functionalities. A third-party analogical-API database of this scale has never been achieved before.
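To make the idea of inferring analogical APIs from embeddings more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each API already has two vectors (one for usage semantics, one for description semantics) and ranks candidate APIs in a target library by a weighted sum of cosine similarities. The abstract does not say how the two kinds of semantics are combined, so the weighting scheme, the function names, the API names, and the tiny hand-made vectors are all hypothetical and serve only as illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def combined_similarity(api_a, api_b, usage_weight=0.5):
    """Weighted sum of usage and description similarity.

    The linear weighting is an assumption made for this sketch only.
    """
    usage_sim = cosine(api_a["usage"], api_b["usage"])
    desc_sim = cosine(api_a["description"], api_b["description"])
    return usage_weight * usage_sim + (1.0 - usage_weight) * desc_sim


def rank_analogical_apis(query, candidates, top_k=3):
    """Return the top_k candidate API names with scores, best first."""
    scored = [
        (name, combined_similarity(query, vectors))
        for name, vectors in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; real ones would come from trained models.
source_api = {"usage": [0.9, 0.1, 0.0], "description": [0.8, 0.2, 0.1]}
target_library = {
    "parseDocument": {"usage": [0.85, 0.15, 0.05], "description": [0.75, 0.25, 0.1]},
    "closeStream": {"usage": [0.0, 0.2, 0.9], "description": [0.1, 0.1, 0.9]},
    "writeDocument": {"usage": [0.3, 0.9, 0.1], "description": [0.6, 0.5, 0.1]},
}

ranking = rank_analogical_apis(source_api, target_library)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup the candidate whose usage and description vectors both point in nearly the same direction as the query ranks first. Scoring the two kinds of semantics separately means a candidate with a differently worded description can still rank well if its usage vector is close, which is one plausible way to soften the lexical gap mentioned above.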
