Published in: 3rd PhD Workshop on Information and Knowledge Management 2010

Entity Classification by Bag of Wikipedia Articles

Abstract

The input to a Bag-of-Articles (BOA) classifier is a set of unlabeled entities (noun chunks) and a set of target labeled entities (Wikipedia articles). For each unlabeled entity, the classifier locates Wikipedia articles that might define it and performs disambiguation to select one. Both unlabeled and labeled entities are represented by the proposed BOA term weight vector, which is created by aggregating the term weight vectors of articles related to the Wikipedia article defining the entity. The label is assigned by choosing the closest labeled entity, itself represented as a BOA term weight vector, under cosine similarity. The paper formally defines the BOA entity representation and BOA-based entity classification and presents a partial software implementation. A BOA-based disambiguation algorithm is presented as a planned extension.
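
The classification step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation. The abstract does not say how vectors are aggregated or how terms are weighted, so plain summation of sparse term weight vectors is assumed here. The function names (`aggregate`, `cosine`, `classify`) and the toy vectors are hypothetical.

```python
import math
from collections import Counter


def aggregate(vectors):
    """Build a BOA vector from the term weight vectors of related articles.

    Assumption: aggregation is a plain sum of weights per term.
    """
    boa = Counter()
    for vec in vectors:
        boa.update(vec)  # Counter.update adds the weights of a mapping
    return dict(boa)


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(entity_boa, labeled_boas):
    """Return the label whose BOA vector is closest to the entity's."""
    return max(labeled_boas, key=lambda label: cosine(entity_boa, labeled_boas[label]))


# Hypothetical toy data: each inner dict stands for one related article.
unlabeled = aggregate([
    {"engine": 0.8, "wheel": 0.5},
    {"engine": 0.4, "road": 0.6},
])
labeled = {
    "Vehicle": aggregate([{"engine": 1.0, "road": 0.7}, {"wheel": 0.9}]),
    "Animal": aggregate([{"fur": 1.0, "predator": 0.8}, {"road": 0.1}]),
}
print(classify(unlabeled, labeled))
```

In this toy example the unlabeled entity shares most of its weighted terms with the "Vehicle" vector, so that label is chosen. The sketch omits locating candidate Wikipedia articles and disambiguating among them, both of which come before this step in the described pipeline.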