Conference: Working Conference on Mining Software Repositories

Topic Modeling of NASA Space System Problem Reports: Research in Practice

Abstract

Problem reports at NASA are similar to bug reports: they capture defects found during testing and post-launch operational anomalies, and they document the investigation and corrective action for each issue. These artifacts are a rich source of lessons learned for NASA, but they are expensive to analyze because problem reports consist primarily of natural language text. We apply topic modeling to a corpus of NASA problem reports to extract trends in testing and operational failures. We collected 16,669 problem reports from six NASA space flight missions and applied Latent Dirichlet Allocation topic modeling to the document corpus. We analyze the most popular topics within and across missions, and how popular topics changed over the lifetime of a mission. We find that hardware material and flight software issues are common during the integration and testing phase, while ground station software and equipment issues are more common during the operations phase. We identify a number of challenges in topic modeling for trend analysis: (1) the process of selecting the topic modeling parameters lacks definitive guidance, (2) defining semantically meaningful topic labels requires non-trivial effort and domain expertise, (3) topic models derived from the combined corpus of the six missions were biased toward the larger missions, and (4) topics must be semantically distinct as well as cohesive to be useful. Nonetheless, topic modeling can identify problem themes within missions and across mission lifetimes, providing useful feedback to engineers and project managers.
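
The trend analysis described in the abstract (how popular each topic is in each mission phase) can be illustrated with a minimal Python sketch. It assumes every report has already been assigned one dominant topic by a fitted topic model. The function name `topic_share_by_phase`, the phase names, the topic labels, and the toy records are all hypothetical and are not taken from the paper or its data.

```python
from collections import Counter, defaultdict


def topic_share_by_phase(reports):
    """Return {phase: {topic: share}} from (phase, dominant_topic) pairs.

    Each share is the fraction of that phase's reports whose dominant
    topic is the given topic, so the shares within a phase sum to 1.
    """
    counts = defaultdict(Counter)
    for phase, topic in reports:
        counts[phase][topic] += 1

    shares = {}
    for phase, topic_counts in counts.items():
        total = sum(topic_counts.values())
        shares[phase] = {t: n / total for t, n in topic_counts.items()}
    return shares


# Hypothetical toy records: (mission phase, dominant topic label).
reports = [
    ("integration_and_test", "flight software"),
    ("integration_and_test", "hardware material"),
    ("integration_and_test", "flight software"),
    ("integration_and_test", "ground station software"),
    ("operations", "ground station software"),
    ("operations", "ground station equipment"),
    ("operations", "ground station software"),
    ("operations", "flight software"),
]

shares = topic_share_by_phase(reports)
for phase, dist in shares.items():
    # Report the most popular topic in each phase and its share.
    top_topic = max(dist, key=dist.get)
    print(phase, top_topic, round(dist[top_topic], 2))
```

Normalizing within each phase, rather than over the whole corpus, keeps a phase with many reports from dominating the comparison. The same idea could be applied per mission, which relates to the bias toward larger missions noted in challenge (3).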