Conference: Pacific Asia Conference on Language, Information and Computation

Trouble information extraction based on a bootstrap approach from Twitter

Abstract

In this paper, we propose a method for extracting trouble information from Twitter. One common approach relies on machine learning techniques such as SVMs. However, trouble information accounts for only a fraction of a percent of all tweets on Twitter, and such an imbalanced distribution makes it difficult for machine learning techniques to produce a reliable classifier. Another approach is to extract trouble information with handwritten rules, but constructing high-coverage rules by hand is costly. We first verify these problems in a preliminary experiment. Then, to address them, we apply a bootstrapping method to the trouble information extraction task, introducing three characteristics and a scoring method into the bootstrapping process. As a result, iterating the bootstrapping process dramatically increased the number of tweets and patterns obtained for trouble information.
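The abstract does not spell out the three characteristics or the scoring method, so the following is only a minimal sketch of a generic bootstrapping loop of the kind described, not the authors' implementation. It assumes that patterns are word bigrams and that a candidate is scored by the share of its occurrences that fall inside already matched tweets. The function names (`bigrams`, `bootstrap`), the parameters (`min_score`, `min_support`, `top_k`), and the toy tweets are all hypothetical.

```python
from collections import Counter


def bigrams(text):
    """Return the set of adjacent word pairs in a lowercased tweet."""
    words = text.lower().split()
    return {" ".join(pair) for pair in zip(words, words[1:])}


def bootstrap(tweets, seed_patterns, iterations=3, min_score=0.5,
              min_support=2, top_k=2):
    """Alternate between matching tweets and harvesting new patterns.

    Assumed scoring (not from the paper): the fraction of tweets
    containing a candidate bigram that are already matched.
    """
    grams = {t: bigrams(t) for t in tweets}
    patterns = set(seed_patterns)
    extracted = set()
    for _ in range(iterations):
        # Step 1: collect tweets that contain at least one known pattern.
        matched = {t for t in tweets if grams[t] & patterns}
        extracted |= matched
        # Step 2: count candidate bigrams in matched tweets and in all tweets.
        in_matched = Counter(g for t in matched for g in grams[t])
        in_all = Counter(g for t in tweets for g in grams[t])
        # Step 3: score unseen candidates that have enough support.
        scored = [
            (in_matched[g] / in_all[g], g)
            for g in in_matched
            if g not in patterns and in_matched[g] >= min_support
        ]
        # Step 4: keep the best-scoring candidates above the threshold.
        best = sorted(scored, reverse=True)[:top_k]
        new = {g for score, g in best if score >= min_score}
        if not new:
            break  # nothing left to learn, stop early
        patterns |= new
    return extracted, patterns


# Toy data, invented purely for illustration.
TWEETS = [
    "my phone is broken and will not turn on",
    "the printer is broken and will not turn on",
    "laptop will not turn on after the update",
    "nice weather today in tokyo",
    "the update looks great today",
]

extracted, patterns = bootstrap(TWEETS, {"is broken"})
```

On this toy data the single seed pattern matches only the first two tweets; the third tweet is reached only after the loop has picked up further bigrams from those two, which is the growth in tweets and patterns that the abstract attributes to iteration. The `min_support` and `min_score` thresholds are one simple way to limit drift toward unrelated patterns; the paper's own scoring may differ.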
