Multi-stage modeling of HTML documents.

Abstract

The goal of this thesis is to first give the reader an accurate picture of several models of both information discovery and extraction within the World Wide Web and how those two processes are becoming increasingly interrelated in overall information analysis. Furthermore, it will investigate how a sophisticated analysis of visual documents, such as those on the Web, is becoming increasingly important in both finding and understanding the context of document information. The thesis presents several problems within document analysis and then tries to approximate solutions to those problems in a general analysis framework, which is implemented in a prototype application. Finally, an instance of the framework is used to demonstrate its own practicality by accumulating statistics on features of web documents such as script and style usage that are only discovered by a deeper document analysis.

Bibliographic details

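Author: Levering, Ryan Reed
Author affiliation: State University of New York at Binghamton
Degree-granting institution: State University of New York at Binghamton
Subject: Computer science; Information science
Degree: M.S.
Year: 2004
Pages: 76 p.
Total pages: 76
Original format: PDF
Language: English (eng)
Chinese Library Classification: Aquaculture, fisheries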

Similar literature

1. Influence of orthogonalization procedure on astrophysical S-factor for the direct $\alpha + d \to {}^{6}\mathrm{Li} + \gamma$ capture process in a three-body model [J] . E. M. Tursunov, A. S. Kadyrov International Journal of Modern Physics: Conference Series . 2019, Issue a

2. A freely generated ring for $\mathcal{N} = 1$ models in class $\mathcal{S}_k$ [J] . Shlomo S. Razamat, Evyatar Sabag The journal of high energy physics . 2018, Issue 7

3. $\mathcal{IMA}$–$\mathcal{CID}$: an integrated modeling approach for developing educational modules [J] . Ellen Francine Barbosa, José Carlos Maldonado Brazilian Computer Society. Journal . 2011, Issue 4

4. A Belief Networks-Based Generative Model for Structured Documents. An Application to the XML Categorization [C] . Ludovic Denoyer, Patrick Gallinari Machine Learning and Data Mining in Pattern Recognition . 2003

5. Context-based content extraction of HTML documents. [D] . Gupta, Suhit. 2006

6. Datasets on mathematical modeling of multi-product multi-stage production to analyze the relationship between production yield demand and costs [O] . Işılay Talay, Öznur Özdemir-Akyıldırım 2019

7. Relational models for visual understanding of graphical documents. Application to architectural drawings. [O] . de las Heras Lluís-Pere 2015

8. New generation of software. Modeling of energy demands for residential ventilation with HTML interface [R] . Forowicz, T. 1997

