Generating Question Titles for Stack Overflow from Mined Code Snippets

ZHIPENG GAO; XIN XIA; JOHN GRUNDY; DAVID LO; YUAN-FANG LI

Abstract

Software developers rely heavily on Stack Overflow (SO) to seek programming-related information from peers over the internet. The Stack Overflow community recommends that users include the related code snippet when creating a question, to help others understand it and offer help. Previous studies have shown that a significant number of these questions are of low quality and unattractive to potential experts on Stack Overflow. Such poorly asked questions are less likely to receive useful answers, and they hinder the overall knowledge generation and sharing process. One reason for low-quality questions on SO is that many developers cannot clarify and summarize the key problem behind the code snippet they present, because they lack knowledge and terminology related to the problem, have poor writing skills, or both. In this study, we propose an approach that assists developers in writing high-quality questions by automatically generating a question title for a code snippet using deep sequence-to-sequence learning. Our approach is fully data-driven and uses an attention mechanism to perform better content selection, a copy mechanism to handle the rare-words problem, and a coverage mechanism to eliminate word repetition. We evaluate our approach on Stack Overflow datasets covering a variety of programming languages (e.g., Python, Java, JavaScript, C#, and SQL). Our experimental results show that it significantly outperforms several state-of-the-art baselines in both automatic and human evaluation. We have released our code and datasets to help other researchers verify their ideas and to inspire follow-up work.
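To make the roles of the three mechanisms concrete, the following is a minimal sketch of a single decoding step, written in the general style of pointer-generator models. It is not the authors' released implementation: the function names, the subtraction-based coverage penalty, the fixed `p_gen` value, and the toy scores are all assumptions chosen for illustration. A trained model would learn these quantities from data.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_with_coverage(scores, coverage, penalty=1.0):
    """Attention over source tokens (content selection).

    Simplifying assumption: positions that were already attended to
    (high coverage) have their score reduced by a fixed penalty,
    instead of passing coverage through a learned layer.
    """
    adjusted = [s - penalty * c for s, c in zip(scores, coverage)]
    return softmax(adjusted)


def final_distribution(p_gen, vocab_dist, attn, source_tokens):
    """Copy mechanism: mix generating from the vocabulary with
    copying source tokens in proportion to their attention weight."""
    final = {word: p_gen * p for word, p in vocab_dist.items()}
    for token, weight in zip(source_tokens, attn):
        final[token] = final.get(token, 0.0) + (1.0 - p_gen) * weight
    return final


def coverage_loss(attn, coverage):
    """Penalty for attending again to positions already covered."""
    return sum(min(a, c) for a, c in zip(attn, coverage))


# Toy example (all values hypothetical).
source_tokens = ["df", ".", "groupby", "(", "'col'", ")"]
raw_scores = [0.5, 0.1, 2.0, 0.1, 1.0, 0.1]
vocab_dist = {"how": 0.5, "to": 0.3, "pandas": 0.2}

coverage = [0.0] * len(source_tokens)
attn_step1 = attention_with_coverage(raw_scores, coverage)
dist_step1 = final_distribution(0.6, vocab_dist, attn_step1, source_tokens)

# Coverage accumulates attention from previous decoding steps.
coverage = [c + a for c, a in zip(coverage, attn_step1)]
attn_step2 = attention_with_coverage(raw_scores, coverage)
```

In this sketch the rare token `groupby` is absent from the toy vocabulary, yet it receives probability mass in `dist_step1` through the copy term. At the second step, the accumulated coverage lowers the attention on `groupby`, which is the intuition behind using coverage to reduce repeated words.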


Bibliographic details

Source
ACM Transactions on Software Engineering and Methodology, 2020, Issue 4, pp. 26.1-26.37 (37 pages)
Authors
ZHIPENG GAO; XIN XIA; JOHN GRUNDY; DAVID LO; YUAN-FANG LI
Author affiliations

Monash University Australia;

Monash University Australia;

Monash University Australia;

Singapore Management University Singapore;

Monash University Australia;

Original format: PDF
Text language: English
Keywords
Stack Overflow; question generation; question quality; sequence-to-sequence
