Probabilistic Methods for Structured Document Classification at INEX'07

Abstract

This paper presents the results of our participation in the Document Mining track at INEX'07, where we focused on the classification of XML documents. Our approach to structured document representations applies plain-text classification methods to flattened versions of the documents, in which some structural properties have been translated into plain text. We explored several options for converting structured documents into flat documents, in combination with two probabilistic methods for text categorization. The main conclusion of our experiments is that exploiting document structure to improve classification results is a difficult task.
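The flatten-then-classify idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the conversion options or the two probabilistic methods, so the tag-prefixing conversion (`flatten_tagged`), the text-only baseline (`flatten_text_only`), the multinomial naive Bayes stand-in classifier (`NaiveBayes`), and the toy documents are all hypothetical and are not the authors' implementation.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def flatten_text_only(xml_string):
    """Baseline conversion (assumed): discard all markup, keep only the words."""
    root = ET.fromstring(xml_string)
    return " ".join(root.itertext()).lower().split()


def flatten_tagged(xml_string):
    """Structure-aware conversion (assumed): prefix each word with the tag
    of the element that directly contains it."""
    tokens = []

    def walk(elem):
        for word in (elem.text or "").lower().split():
            tokens.append(f"{elem.tag}_{word}")
        for child in elem:
            walk(child)
            # Text following a child still belongs to the parent element.
            for word in (child.tail or "").lower().split():
                tokens.append(f"{elem.tag}_{word}")

    walk(ET.fromstring(xml_string))
    return tokens


class NaiveBayes:
    """Multinomial naive Bayes with Laplace smoothing, used here only as a
    stand-in for a probabilistic plain-text classifier."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for w in tokens:
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (invented for this sketch).
train = [
    ("<article><title>football match</title>"
     "<body>the team won the match</body></article>", "sport"),
    ("<article><title>election result</title>"
     "<body>the party won the vote</body></article>", "politics"),
]
model = NaiveBayes().fit(
    [flatten_tagged(xml) for xml, _ in train],
    [label for _, label in train],
)

test_doc = ("<article><title>football team</title>"
            "<body>a match report</body></article>")
prediction = model.predict(flatten_tagged(test_doc))
print(prediction)
```

Swapping `flatten_tagged` for `flatten_text_only` in both the training and prediction steps gives a structure-free baseline, which is the kind of comparison the abstract's conclusion refers to.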

Bibliographic record

Source: Focused Access to XML Documents, 2007, pp. 195-206 (12 pages)
Conference location: Dagstuhl Castle (DE)
Authors: Luis M. de Campos; Juan M. Fernandez-Luna; Juan F. Huete; Alfonso E. Romero
Original format: PDF
Language: English
Chinese Library Classification: Information processing
Date indexed: 2022-08-26 14:06:49
