Journal: Machine Learning and Knowledge Extraction

Using the Outlier Detection Task to Evaluate Distributional Semantic Models


Abstract

In this article, we define the outlier detection task and use it to compare neural-based word embeddings with transparent count-based distributional representations. Using the English Wikipedia as a text source to train the models, we observed that embeddings outperform count-based representations when their contexts are made up of bag-of-words. However, there are no sharp differences between the two models if the word contexts are defined as syntactic dependencies. In general, syntax-based models tend to perform better than those based on bag-of-words for this specific task. Similar experiments were carried out for Portuguese with similar results. The test datasets we have created for the outlier detection task in English and Portuguese are freely available.
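
The abstract does not spell out how an outlier is scored, so the following is only a minimal sketch of one common way to set up such a task. It assumes that each test item is a small set of words in which one does not belong, and that a model's guess is the word with the lowest mean cosine similarity to the others. The function name `detect_outlier`, the scoring rule, and the toy vectors are all illustrative assumptions, not the authors' method or data.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def detect_outlier(words, vectors):
    """Pick the word least similar, on average, to the rest of the set.

    Assumed scoring rule (not taken from the article): each word's score is
    its mean cosine similarity to every other word in the set; the lowest
    score marks the predicted outlier.
    """
    scores = {}
    for w in words:
        others = [o for o in words if o != w]
        scores[w] = sum(cosine(vectors[w], vectors[o]) for o in others) / len(others)
    return min(scores, key=scores.get), scores


# Hypothetical toy vectors, made up for illustration only. In practice they
# would come from an embedding model or a count-based model.
toy_vectors = {
    "apple": np.array([0.9, 0.1, 0.0]),
    "banana": np.array([0.8, 0.2, 0.1]),
    "cherry": np.array([0.85, 0.15, 0.05]),
    "hammer": np.array([0.1, 0.9, 0.3]),
}

outlier, scores = detect_outlier(list(toy_vectors), toy_vectors)
print(outlier)
```

Under this setup, two distributional models can be compared by running the same test sets through each model's vectors and counting how often the predicted outlier matches the annotated one.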
