Generating vector representations of code capturing semantic similarity

Abstract

A method, system, and computer program product for obtaining vector representations of code snippets that capture semantic similarity. A first and a second training set of code snippets are collected: the snippets in the first training set implement the same function, representing semantic similarity, while the snippets in the second training set implement different functions, representing semantic dissimilarity. A machine learning model generates vector representations of a first and a second code snippet drawn from either the first or the second training set. A loss value is then computed with a loss function that is proportional to the distance between the first and second vectors when the snippets come from the first training set, and inverse to that distance when they come from the second training set. The machine learning model is trained to capture semantic similarity in code snippets by minimizing the loss value.
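The loss described in the abstract can be illustrated with a minimal sketch. The abstract does not give a formula, so this example swaps in a standard margin-based contrastive loss as an assumption: the loss grows with the distance for a pair that implements the same function, and shrinks as the distance grows for a pair that implements different functions. It decreases with distance but is not literally 1/distance. The names `euclidean_distance`, `pair_loss`, `same_function`, and `margin` are hypothetical, and the two-dimensional vectors stand in for whatever the machine learning model would produce.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pair_loss(u, v, same_function, margin=1.0):
    """Margin-based contrastive loss for one pair of snippet vectors.

    Assumption: this is a common contrastive formulation, not a formula
    taken from the patent text.

    same_function=True  -> pair from the first (similar) training set;
                           the loss increases with the distance.
    same_function=False -> pair from the second (dissimilar) training set;
                           the loss decreases as the distance grows and
                           is zero once the distance reaches `margin`.
    """
    d = euclidean_distance(u, v)
    if same_function:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical vectors for two snippets, at distance 0.5 from each other.
vec_a = [0.0, 0.0]
vec_b = [0.3, 0.4]

similar_loss = pair_loss(vec_a, vec_b, same_function=True)
dissimilar_loss = pair_loss(vec_a, vec_b, same_function=False)
print(similar_loss, dissimilar_loss)
```

Under this formulation, minimizing the loss over many pairs pulls vectors of same-function snippets together and pushes vectors of different-function snippets at least `margin` apart, which is one way to obtain the behavior the abstract describes.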
