International Conference on Database Systems for Advanced Applications

A PRELIMINARY STUDY ON THE EXTRACTION OF SOCIO-TOPICAL WEB KEYWORDS

Abstract

In recent years, the Web has become a popular medium for disseminating the information, news, ideas, and opinions of modern society. As a result, Web content reflects current events and trends in the real world, which in turn has attracted considerable interest in using the Web as a sociological research tool for detecting emerging topics and social trends. To facilitate this kind of sociological research, in this paper we study the characteristics of socio-topical web keywords sampled from a series of Thai web snapshots. A socio-topical web keyword is a keyword, extracted from the content of web pages, that relates to a topic of interest in a real-world society. The study was conducted as follows. First, socio-topical keywords were sampled from the inverted index of each Thai web snapshot. Then, for each sampled keyword, we observed the pattern of changes in the number of documents containing the keyword and in its inverse document frequency (IDF) score. Finally, we tried to find relationships between the observed patterns of change and the corresponding real-world events in Thai society.
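
The per-keyword observation step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify which IDF variant the authors used or how the snapshots are stored, so everything here is an assumption: the classic form log(N / df), the in-memory `snapshots` layout, the helper names `idf` and `track_keyword`, and the toy keywords and counts, which are hypothetical and not data from the study.

```python
import math

def idf(num_docs, doc_freq):
    """Classic IDF, log(N / df); an assumed variant, not confirmed by the paper.

    Returns None when the keyword does not occur in the snapshot.
    """
    if doc_freq == 0:
        return None
    return math.log(num_docs / doc_freq)

def track_keyword(snapshots, keyword):
    """Return (label, document count, IDF) for the keyword in each snapshot.

    Each snapshot is a hypothetical triple:
    (label, total number of documents, inverted index mapping keyword -> set of doc ids).
    """
    series = []
    for label, total_docs, inverted_index in snapshots:
        doc_freq = len(inverted_index.get(keyword, ()))
        series.append((label, doc_freq, idf(total_docs, doc_freq)))
    return series

# Toy data, purely illustrative.
snapshots = [
    ("snapshot-1", 1000, {"flood": {1, 2}, "election": {3}}),
    ("snapshot-2", 1000, {"flood": set(range(100)), "election": {3, 4}}),
]

series = track_keyword(snapshots, "flood")
for label, doc_freq, score in series:
    print(label, doc_freq, score)
```

In this sketch, a keyword whose document count rises sharply between snapshots shows a corresponding drop in IDF, which is the kind of change pattern one might then compare against real-world events.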
