Conference: Technologies and applications of artificial intelligence

Cross-Domain Opinion Word Identification with Query-By-Committee Active Learning


Abstract

Opinion word identification (OWI) is an important task in opinion mining. In OWI, the exact positions of opinion word mentions must be found. Supervised learning approaches can locate such mentions with high accuracy. To construct an OWI system for a new domain, enough data must be annotated to represent the new domain's characteristics. However, because annotating every new domain extensively is costly, making the best use of existing annotated data is an important challenge for mention-based OWI systems. In this work, we propose a cross-domain OWI system. The query-by-committee (QBC) active learning scheme is used to select a controlled amount of data in the new domain for manual annotation. The newly annotated data complements the existing annotated data of the original domain. We compile three annotated datasets, one for each of three different domains, and conduct domain adaptation experiments on all six domain pairs. Our experiments show that by adding only 1,000 newly annotated sentences from the new domain to the existing annotated data, our system achieves nearly the same level of accuracy as a system trained on 10,000 annotated new-domain sentences. Our system with the QBC active learning scheme also outperforms the same system with a random selection scheme.
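
The QBC selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how committee disagreement is measured or what the committee members are, so everything here is an assumption for illustration: disagreement is scored as the mean per-token vote entropy, the committee members are toy lexicon taggers, and the names make_lexicon_tagger, vote_entropy, sentence_disagreement, and select_for_annotation are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter


def make_lexicon_tagger(lexicon):
    """Toy committee member (assumption): tags a token OPINION if it is in the lexicon."""
    return lambda tokens: ["OPINION" if t in lexicon else "O" for t in tokens]


def vote_entropy(votes):
    """Entropy of the committee's label votes for a single token."""
    total = len(votes)
    counts = Counter(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def sentence_disagreement(tokens, committee):
    """Mean per-token vote entropy across the committee for one sentence."""
    if not tokens:
        return 0.0
    predictions = [tagger(tokens) for tagger in committee]
    # zip(*predictions) groups the committee's votes token by token.
    per_token = [vote_entropy(votes) for votes in zip(*predictions)]
    return sum(per_token) / len(per_token)


def select_for_annotation(pool, committee, budget):
    """Return the `budget` unlabeled sentences the committee disagrees on most."""
    ranked = sorted(
        pool,
        key=lambda tokens: sentence_disagreement(tokens, committee),
        reverse=True,
    )
    return ranked[:budget]


# Illustrative committee and new-domain pool (made-up data, not from the paper).
committee = [
    make_lexicon_tagger({"great", "terrible"}),
    make_lexicon_tagger({"great", "solid"}),
    make_lexicon_tagger({"great", "terrible", "solid"}),
]
pool = [
    ["the", "screen", "is", "great"],   # every member agrees on every token
    ["battery", "feels", "solid"],      # members split on "solid"
    ["terrible", "but", "solid"],       # members split on two tokens
]

to_annotate = select_for_annotation(pool, committee, budget=1)
print(to_annotate)
```

In a full active learning loop, the selected sentences would be annotated by hand, added to the existing source-domain training data, and the committee retrained before the next round. A random selection baseline, as compared against in the abstract, would replace the ranking with a uniform sample from the pool.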
