Conference: 2018 International Conference on Engineering, Applied Sciences, and Technology
Automatic Labeling for Thai News Articles Based on Vector Representation of Documents

Abstract

Online media on the Internet is now the most powerful news source in the world. Information arrives through social networking services (SNS), video clips, audio clips, and news websites. In this competitive environment, many news websites focus on publishing content as quickly as possible and do not take the time to label it correctly. As a result, readers cannot find the news they are interested in among the large amount of information on a website. In this paper, we propose a method for automatically labeling articles on Thai-language websites using distributed representations of documents. Semantically similar words are extracted from the paragraph vectors of each news category and assigned as labels. We apply a convolutional neural network with a binary classification approach to segment sentences into words. The experimental results indicate that our method can be applied to label Thai news articles automatically and effectively.
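
The labeling step described in the abstract, picking the words closest to a category's paragraph vector, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine` and `top_labels`, the three-dimensional vectors, and the English words are all hypothetical stand-ins. In the paper, the vectors would be learned from a Thai news corpus, and cosine similarity is assumed here as the similarity measure because the abstract does not name one.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def top_labels(category_vector, word_vectors, k=2):
    """Return the k words whose vectors are closest to the category vector."""
    ranked = sorted(
        word_vectors,
        key=lambda word: cosine(category_vector, word_vectors[word]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy embeddings (assumption: real ones would be learned
# from the corpus, for example with a paragraph-vector model).
word_vectors = {
    "football": [0.9, 0.1, 0.0],
    "league": [0.8, 0.2, 0.1],
    "election": [0.1, 0.9, 0.2],
    "budget": [0.0, 0.7, 0.6],
}

# Hypothetical paragraph vector standing in for a "sports" news category.
sports_vector = [1.0, 0.1, 0.0]

labels = top_labels(sports_vector, word_vectors, k=2)
print(labels)
```

With these toy values, the two words nearest the category vector are selected as its labels. The choice of `k` is a tuning decision that the abstract does not specify.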