Journal: International Journal of Software Engineering and Knowledge Engineering

Developer Identity Linkage and Behavior Mining Across GitHub and StackOverflow

Abstract

Software developers are increasingly active on both GitHub and StackOverflow, and their activity produces a large amount of valuable data in the two communities. Researchers mine this data to understand developer behavior, but previous work has mainly focused on a single community. In this paper, we propose a novel approach to developer identity linkage and behavior mining across GitHub and StackOverflow. The approach links accounts from the two communities using a CART decision tree that draws on features from usernames, user behaviors and writing styles. It then explores cross-site developer behavior through T-graph analysis, LDA-based topic clustering and cross-site tagging. We conducted several experiments to evaluate the approach. The results show that the precision and F-score of our identity linkage method are higher than those of previous methods in software communities. In particular, we found that (1) active issue committers are also active question askers; (2) for most developers, the topics of their content in GitHub are similar to those of their questions and answers in StackOverflow; (3) developers' concerns in StackOverflow shift over time along with the GitHub projects they are currently participating in; (4) developers' concerns in GitHub are more relevant to their answers than to their questions and comments in StackOverflow.
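The abstract gives no implementation details, so the following is only a minimal sketch of the kind of step a CART-based linkage method involves: choosing the binary split that minimizes weighted Gini impurity over candidate account pairs. Everything here is an assumption for illustration: the two username features, the helper names (`username_features`, `gini`, `best_split`) and the toy account pairs are hypothetical and are not taken from the paper, which also uses behavior and writing-style features not shown here.

```python
from difflib import SequenceMatcher

def username_features(gh_name, so_name):
    """Two illustrative username features (assumed, not the paper's feature set)."""
    a, b = gh_name.lower(), so_name.lower()
    ratio = SequenceMatcher(None, a, b).ratio()  # string similarity in [0, 1]
    exact = 1.0 if a == b else 0.0               # case-insensitive exact match
    return [ratio, exact]

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) for the binary split
    with the lowest weighted Gini impurity, the criterion CART uses at a node."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

# Hypothetical labeled pairs: (GitHub username, StackOverflow username, same person?)
pairs = [
    ("jsmith", "jsmith", 1),
    ("alice_dev", "alicedev", 1),
    ("bob42", "Bob42", 1),
    ("jsmith", "kwong", 0),
    ("alice_dev", "zed", 0),
    ("bob42", "mallory", 0),
]
rows = [username_features(gh, so) for gh, so, _ in pairs]
labels = [y for _, _, y in pairs]

feature, threshold, impurity = best_split(rows, labels)
print(feature, round(threshold, 3), impurity)
```

On this toy data the similarity ratio separates the two classes cleanly, whereas the exact-match flag misses the `alice_dev` / `alicedev` pair. A full CART tree would apply the same split selection recursively to each resulting subset until a stopping rule is met.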
