Attribute Extracting from Wikipedia Pages in Domain Automatically

Abstract

In the age of Big Data, input determines output. There is a large amount of data on the internet, but little knowledge, so researchers have developed various methods to automatically extract knowledge from different data platforms. Traditional supervised learning methods cost more time and labor, and are gradually being replaced by semi-supervised and unsupervised learning methods. In this paper we propose a new low-cost semi-supervised method for this task, called TSVM (Transductive Support Vector Machine). To improve accuracy and the level of automation, we also add word embeddings to the semi-supervised method, and use the AP (Affinity Propagation) algorithm to cluster words automatically. Experimental results show that the method extracts attribute information in the military transportation domain from Wikipedia better than the traditional supervised learning method.
