Text Mining to Concept Mining: Leads Feature Location in Software System

Abstract

In the agile application development environment, automatically identifying the relevant components of a large, complex software system for software maintenance remains a research problem as software applications proliferate. Earlier, concept mining with formal concept analysis was one of the commonly applied techniques for legacy software systems of small to medium size. Recently, text mining has been widely used for locating features or concerns in large, complex software systems. Nevertheless, the literature study reveals that combining text mining with other techniques always yields better accuracy in locating features. Although formal concept analysis is efficient, applying it to large systems is limited by the exponential time complexity of constructing concept lattices. In this research work, a model is devised to combine text mining and concept mining for large systems. The unsupervised machine learning technique Latent Dirichlet Allocation, also called topic modeling, is used to reduce the feature space; K-Means clustering is then applied to the reduced space to cluster related documents, and formal concept analysis is carried out on the individual clusters. Three open source software systems, namely JEdit, ArgoUML and JabRef, are considered for the experimental study. The empirical evaluation of the proposed model's feature location performance shows a significant improvement in accuracy, scalability, flexibility and efficiency over contemporary methods in the literature.
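
To make the last step of the pipeline concrete, the following is a minimal sketch of formal concept analysis run separately on each cluster, which is why clustering first helps: the naive enumeration below is exponential in the number of objects, so it is only affordable on small sub-contexts. It is not the authors' implementation. The function names, the toy method-to-term context, and the hand-written cluster labels are all hypothetical; in the described model the labels would come from Latent Dirichlet Allocation followed by K-Means, which is not reproduced here.

```python
from itertools import combinations


def intent(objs, context):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    shared = set().union(*context.values())
    for obj in objs:
        shared &= context[obj]
    return frozenset(shared)


def extent(attrs, context):
    """Objects that carry every attribute in attrs."""
    return frozenset(obj for obj, have in context.items() if attrs <= have)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of objects.

    Deliberately naive: it visits 2^n object subsets, which illustrates the
    exponential cost the abstract mentions.
    """
    concepts = set()
    objs = list(context)
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            shared = intent(subset, context)
            concepts.add((extent(shared, context), shared))
    return concepts


def concepts_per_cluster(context, labels):
    """Split the context by cluster label and run FCA on each part."""
    clusters = {}
    for obj, label in labels.items():
        clusters.setdefault(label, {})[obj] = context[obj]
    return {label: formal_concepts(sub) for label, sub in clusters.items()}


# Hypothetical toy data: method names mapped to the terms they contain.
context = {
    "openFile": {"file", "open"},
    "saveFile": {"file", "save"},
    "drawNode": {"draw", "node"},
    "drawEdge": {"draw", "edge"},
}
# Assumed cluster labels, written by hand in place of LDA + K-Means output.
labels = {"openFile": 0, "saveFile": 0, "drawNode": 1, "drawEdge": 1}

per_cluster = concepts_per_cluster(context, labels)
for label, concepts in sorted(per_cluster.items()):
    for ext, shared in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
        print(label, sorted(ext), sorted(shared))
```

Each cluster yields its own small set of concepts, for example the pair of `openFile` and `saveFile` sharing the term `file`. Concepts that span clusters are not found this way, which is the trade-off of analysing clusters independently.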

Bibliographic details

Source
IEEE International Conference on Computational Intelligence and Computing Research, 2018, pp. 1-7 (7 pages)
Authors
A. S. Baby Rani; A. R. Nadira Banu Kamal
Original format
PDF
Keywords
Lattices; Software systems; Text mining; Formal concept analysis; Clustering algorithms
