Extracting Various Types of Informative Web Content via Fuzzy Sequential Pattern Mining

Abstract

In this paper, we present a web content extraction method that extracts different types of informative web content from news web pages. A fuzzy sequential pattern mining method, namely FSP, is developed to gradually discover fuzzy sequential patterns for various types of informative web content. Because the usage of HTML tags may change as web technology develops, fuzzy sequential patterns are mined using a stable feature, namely the number of tokens in each line of source code. We have conducted extensive experiments and observed good clustering properties for the discovered sequential patterns. Experimental results demonstrate that the FSP method is effective compared with state-of-the-art content extraction methods. Besides the main articles of web pages, it can also effectively find other types of interesting web content, such as article recommendations and article titles.
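
As a rough illustration of the feature the abstract describes (the number of tokens in each line of source code), the sketch below counts tokens per line and assigns each count a fuzzy degree of membership in "short" and "long" sets. This is not the paper's FSP algorithm: the tokenizer (runs of word characters), the function names, and the thresholds `low_max` and `high_min` are all assumptions made for illustration.

```python
import re


def token_counts(html_source):
    """Count tokens on each line of the page source.

    Assumption: a "token" is a run of word characters; the paper's
    own tokenization is not given in the abstract.
    """
    return [len(re.findall(r"\w+", line)) for line in html_source.splitlines()]


def fuzzy_memberships(count, low_max=5, high_min=20):
    """Map a token count to membership degrees in 'short' and 'long'.

    Assumption: a piecewise-linear ramp between two hypothetical
    thresholds; the degrees always sum to 1.
    """
    if count <= low_max:
        long_degree = 0.0
    elif count >= high_min:
        long_degree = 1.0
    else:
        long_degree = (count - low_max) / (high_min - low_max)
    return {"short": 1.0 - long_degree, "long": long_degree}


def label_sequence(html_source):
    """Turn a page into a sequence of fuzzy labels, one per line."""
    labels = []
    for count in token_counts(html_source):
        degrees = fuzzy_memberships(count)
        # Keep the label with the highest membership degree.
        labels.append(max(degrees, key=degrees.get))
    return labels
```

A sequence such as `["short", "short", "long", "long", "short"]` is the kind of input a sequential pattern miner could then scan for recurring runs, for example a block of long lines that may correspond to a main article.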

Bibliographic details

Source
Asia Pacific Web and Web-Age Information Management | 2017 | 662p | 9 pages
Authors
Ting Huang; Ruizhang Huang; Bowei Liu; Yingying Yan
Original format
PDF
Chinese Library Classification
TP393-53
Keywords
Content extraction; Fuzzy sequential pattern; Recommendation discovery

Similar literature

1. A Fuzzy Data Mining Algorithm for Incremental Mining of Quantitative Sequential Patterns [J]. R. B. V. Subramanyam, A. Goswami. International Journal of Uncertainty, Fuzziness, and Knowledge-based Systems. 2005, No. 6
2. Mining Fuzzy Sequential Patterns with Fuzzy Time-Intervals in Quantitative Sequence Databases [J]. Truong Duc Phuong, Do Van Thanh, Nguyen Duc Dung. Cybernetics and Information Technologies: CIT. 2017, No. 2
3. Simple Fuzzy Grid Partition for Mining Multiple-level Fuzzy Sequential Patterns [J]. Yi-Chung Hu. Cybernetics and Systems. 2007, No. 2
4. Extracting Various Types of Informative Web Content via Fuzzy Sequential Pattern Mining [C]. Ting Huang, Ruizhang Huang, Bowei Liu, Yingying Yan. Asia-Pacific Web and Web-Age Information Management Joint Conference on Web and Big Data. 2017
5. Combined Mining of Web Server Logs and Web Contents for Classifying User Navigation Patterns and Predicting Users' Future Requests [D]. Liu, Haibin. 2005
6. FuzzyGap: Sequential Pattern Mining for Predicting Chronic Heart Failure in Clinical Pathways [O]. Eric W. Lee, Joyce C. Ho. 2019
7. Using Domain Ontology and Sequential Rule Mining for Extracting Behavior Patterns from Web Navigation Logs [O]. C. Ramesh, K.V. Chalapathi Rao, A. Govardhan. 2014
8. Using the Random Nearest Neighbor Data Mining Method to Extract Maximum Information Content from Weather Forecasts from Multiple Predictors of Weather and One Predictand (Low-Level Turbulence) [R]. Keller, D. L. 2014
