Journal: Advanced Engineering Informatics

Automated classification of building information modeling (BIM) case studies by BIM use based on natural language processing (NLP) and unsupervised learning



Abstract

This paper comparatively analyzes a method to automatically classify case studies of building information modeling (BIM) in construction projects by BIM use. Identifying a project that has used BIM in a manner of interest generally takes a minimum of thirty minutes to hours of collection and review, and an average of four information sources. To automate and expedite these analysis tasks, this study deployed natural language processing (NLP) and two commonly used unsupervised learning methods for text classification, namely latent semantic analysis (LSA) and latent Dirichlet allocation (LDA). The results were validated against one of the representative supervised learning methods for text classification, the support vector machine (SVM). When LSA and LDA detected phrases in a BIM case study whose similarity to the definition of a BIM use exceeded the threshold value, the system determined that the project had deployed BIM for that use. For the classification, the BIM uses specified by Pennsylvania State University were utilized. The approach was validated using 240 BIM case studies (512,892 features). When a BIM use was employed in a project, the project was labeled "1"; when it was not, the project was labeled "0." Performance was analyzed by varying the parameters, namely document segmentation, feature weighting, dimensionality reduction coefficient (k-value), the number of topics, and the number of iterations. LDA yielded the highest F1 score, 80.75% on average. LDA and LSA yielded high recall and low precision in most cases. Conversely, SVM yielded high precision and low recall in most cases, with fluctuations in F1 scores.
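The decision rule and the evaluation metrics described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: plain term-frequency cosine similarity stands in for the LSA and LDA similarity measures, and the function names, the example definition text, and the threshold of 0.3 are all assumptions made for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(segments, definition, threshold):
    """Label a case study 1 if any segment is more similar to the
    BIM-use definition than the threshold, otherwise 0.

    Term-frequency cosine is a stand-in (assumption) for the LSA/LDA
    similarity used in the paper.
    """
    d = Counter(tokenize(definition))
    return int(any(cosine(Counter(tokenize(s)), d) > threshold
                   for s in segments))


def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = BIM use present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical definition text, not the Pennsylvania State University wording.
DEFINITION = "clash detection between building systems"
THRESHOLD = 0.3  # illustrative value, not taken from the paper

case_studies = [
    ["The team used the model for clash detection between systems"],
    ["The project finished on schedule"],
]
predictions = [classify(c, DEFINITION, THRESHOLD) for c in case_studies]
```

In this toy setup a lower threshold admits more segments as matches, which tends to raise recall at the cost of precision; that is the same trade-off pattern the abstract reports for LSA and LDA.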
