Reformatting Web Documents via Header Trees


Abstract

We propose a new method for reformatting web documents by extracting semantic structures from web pages. Our approach is to extract trees that describe hierarchical relations in documents. We developed an algorithm for this task by employing the EM algorithm and clustering techniques. Preliminary experiments showed that our approach was more effective than baseline methods.

Bibliographic details

Source: Association for Computational Linguistics Annual Meeting (ACL-05), 2005, 4 pages
Authors: Minoru Yoshida; Hiroshi Nakagawa
Original format: PDF
Chinese Library Classification: Computer software
