International Semantic Web Conference

Feature LDA: A Supervised Topic Model for Automatic Detection of Web API Documentations from the Web

Abstract

Web APIs have gained increasing popularity in recent Web service technology development owing to the simplicity of their technology stack and the proliferation of mashups. However, efficiently discovering Web APIs and the relevant documentation on the Web is still a challenging task, even with the best resources available on the Web. In this paper we cast the problem of detecting Web API documentation as a text classification problem: deciding whether a given Web page is associated with a Web API or not. We propose a supervised generative topic model called feature latent Dirichlet allocation (feaLDA), which offers a generic probabilistic framework for automatic detection of Web APIs. feaLDA not only captures the correspondence between data and the associated class labels, but also provides a mechanism for incorporating side information, such as labelled features automatically learned from data, that can effectively help improve classification performance. Extensive experiments on our Web API documentation dataset show that the feaLDA model outperforms three strong supervised baselines, namely naive Bayes, support vector machines, and the maximum entropy model, by over 3% in classification accuracy. In addition, feaLDA also gives superior performance when compared against other existing supervised topic models.
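To make the classification framing concrete, the following is a minimal sketch of one of the baseline families named in the abstract, a multinomial naive Bayes classifier, applied to the binary task of labelling a page as Web API documentation or not. It is not feaLDA and not the authors' implementation. The class name `NaiveBayes`, the whitespace tokenizer, the labels `"api"` and `"other"`, and the four toy training pages are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude, assumed tokenizer)."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-alpha smoothing (illustrative baseline only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        # Per-class word frequencies.
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        # Shared vocabulary across all classes.
        self.vocab = set()
        for c in self.classes:
            self.vocab.update(self.word_counts[c])
        return self

    def log_scores(self, doc):
        """Return the unnormalised log posterior of each class for one document."""
        n_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for c in self.classes:
            total = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / n_docs)  # class prior
            for word in tokenize(doc):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[c][word]
                score += math.log(
                    (count + self.alpha) / (total + self.alpha * vocab_size)
                )
            scores[c] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)


# Hypothetical toy pages; a real dataset would be crawled and hand-labelled.
train_pages = [
    ("rest api endpoint returns json response with authentication key", "api"),
    ("api reference request parameters http get method endpoint", "api"),
    ("our company news and holiday recipes for the family", "other"),
    ("travel blog photos from the summer holiday trip", "other"),
]
docs, labels = zip(*train_pages)
model = NaiveBayes().fit(docs, labels)

print(model.predict("http endpoint parameters json"))
print(model.predict("holiday photos and recipes"))
```

A baseline like this relies only on word counts per class. The abstract's claim is that feaLDA improves on such baselines by additionally modelling topics and incorporating labelled features as side information.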