The Data Crawling and Hotspot Analyze of Social QA Site

Abstract

With the rapid development of the Internet, more specialized and detailed information sources such as Q&A sites have gradually emerged. On these social Q&A platforms, hot topics and news are discussed, and even created, every minute. It is therefore of great practical significance to learn about hot social issues by analyzing and parsing the content of social Q&A platforms. Taking one social Q&A platform as its research subject, this paper analyzes the difficulties of crawling data from the platform and the corresponding solutions, then designs and implements a data crawling system consisting of a user information storage module, a highly anonymous and available proxy maintenance module, a node crawling and parsing module, and a data storage module. With these modules, the system can crawl and store data without being restricted by the platform. On this basis, the paper designs and implements a hotspot parsing and grading module. A historical hotspot display module and a trending hotspot display module, both based on ECharts, show the historical and trending hotspots on the platform. The paper then uses the proposed data crawling module and the hotspot analysis and display system to obtain data on 31,520 regularized independent topics and real-time data on 979,815 questions from the platform. Based on these data, the historical and trending hotspot analysis of the platform is displayed. The experimental results show that the system fully meets its design objectives. Finally, the paper summarizes the proposed data crawling and hotspot analysis system and provides references and directions for future work.
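The abstract names a proxy maintenance module but does not describe how it works. As a rough illustration only, the Python sketch below shows one common way such a module could keep a pool of usable proxies: count consecutive failures per proxy and evict a proxy once it crosses a threshold. The class name `ProxyPool`, the `max_failures` parameter, and the eviction rule are all assumptions made for this example and are not taken from the paper.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ProxyPool:
    """Hypothetical in-memory proxy pool (illustrative, not the paper's design)."""

    # Assumed threshold: evict a proxy after this many consecutive failures.
    max_failures: int = 3
    # Maps proxy address -> current count of consecutive failures.
    _failures: dict = field(default_factory=dict, init=False)

    def add(self, proxy: str) -> None:
        """Register a proxy; re-adding a known proxy keeps its failure count."""
        self._failures.setdefault(proxy, 0)

    def pick(self, rng=random) -> str:
        """Return one of the proxies currently considered usable."""
        if not self._failures:
            raise LookupError("no usable proxies left")
        return rng.choice(sorted(self._failures))

    def report(self, proxy: str, ok: bool) -> None:
        """Record the outcome of a request sent through `proxy`."""
        if proxy not in self._failures:
            return  # already evicted or never registered
        if ok:
            self._failures[proxy] = 0  # a success resets the counter
            return
        self._failures[proxy] += 1
        if self._failures[proxy] >= self.max_failures:
            del self._failures[proxy]  # too many failures in a row: evict

    def __len__(self) -> int:
        return len(self._failures)
```

In a crawler built along these lines, each request would call `pick()` to choose a proxy and then `report()` with the outcome, so that blocked or dead proxies drop out of rotation over time. How the paper's system checks anonymity or availability is not stated in the abstract.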

Bibliographic record

Source: International Conference on Network and Information Systems for Computers, 2017, 1 v., 5 pages
Author: Rui-Hui Jia
Original format: PDF
CLC classification: Computing technology, computer technology
Keywords: Market research; IP networks; Internet; Heating systems; Real-time systems; Data acquisition; Memory

