DOCUMENT CLUSTERING USING WORD SENSE DISAMBIGUATION



Abstract

In computational linguistics, word sense disambiguation (WSD) is the problem of determining which sense of a word with several distinct senses is used in a given sentence. This paper addresses text document clustering, one of the major tasks of text processing. Document clustering is the process of discovering groups of related information in text documents and assigning each document to the most relevant group. Large document corpora suffer from ambiguity problems such as synonymy, polysemy, and other semantic relations. For this reason, we perform WSD on all terms in all documents and use the best sense of each term as a document feature in the clustering process. Our experimental results show that the efficiency of document clustering using WSD increases linearly with the size of the document dataset. Different part-of-speech (POS) taggers were tested to determine the best one, and the effect of different window sizes on the WSD task was also compared.
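The idea of replacing terms with disambiguated senses before clustering can be illustrated with a minimal sketch. The abstract does not name the WSD algorithm the authors used, so the example below assumes a simplified Lesk-style gloss overlap as a stand-in. The sense inventory `SENSES`, the stopword list, and the function names `disambiguate` and `sense_features` are all hypothetical, and POS tagging is omitted. The `window` parameter plays the role of the context window size mentioned in the abstract.

```python
from collections import Counter

# Hypothetical toy sense inventory (word -> {sense_id: gloss}).
# A real system would draw senses and glosses from a lexical resource.
SENSES = {
    "bank": {
        "bank#finance": "institution that accepts deposits and lends money",
        "bank#river": "sloping land beside a river or stream of water",
    },
}

# Hypothetical minimal stopword list, ignored when counting overlaps.
STOPWORDS = {"a", "an", "and", "at", "beside", "in", "of", "on", "or",
             "that", "the", "to"}


def disambiguate(tokens, index, window=3):
    """Return the sense whose gloss overlaps most with the context window.

    Assumption: simplified Lesk-style overlap, not the paper's method.
    Words without an entry in SENSES are returned unchanged.
    """
    word = tokens[index]
    senses = SENSES.get(word)
    if not senses:
        return word
    start = max(0, index - window)
    neighbours = tokens[start:index] + tokens[index + 1:index + 1 + window]
    context = set(neighbours) - STOPWORDS

    def overlap(sense_id):
        gloss_words = set(senses[sense_id].split()) - STOPWORDS
        return len(context & gloss_words)

    # On a tie, max() keeps the first-listed sense.
    return max(senses, key=overlap)


def sense_features(text, window=3):
    """Build a bag of sense-tagged terms for one document."""
    tokens = text.lower().split()
    return Counter(
        disambiguate(tokens, i, window)
        for i, tok in enumerate(tokens)
        if tok not in STOPWORDS
    )


doc_a = "she went to the bank to deposit money"
doc_b = "they sat on the bank of the river"
features_a = sense_features(doc_a)
features_b = sense_features(doc_b)
features_b_narrow = sense_features(doc_b, window=1)
```

With a window of 3, the two documents receive different features for "bank" (`bank#finance` and `bank#river`), so a clustering algorithm working on these vectors would no longer treat the two occurrences as the same term. With a window of 1, the context for `doc_b` contains only stopwords, and the sketch falls back to the first-listed sense, which shows why the window size can affect WSD quality.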


Bibliographic details

Source: International Conference on Software Engineering and Data Engineering, 2008, 6 pages
Authors: M. S. Mostafa; M. H. Haggag; W. H. Gomaa
Original format: PDF
Chinese Library Classification: TP311-53

Similar literature

1. Word-Sense Disambiguation for Ontology Mapping: Concept Disambiguation using Virtual Documents and Information Retrieval Techniques [J]. Frederik C. Schadd, Nico Roos. Journal on Data Semantics. 2015, No. 3
2. Adaptive and hybrid context-aware fine-grained word sense disambiguation in topic modeling based document representation [J]. Wenbo Li, Einoshin Suzuki. Information Processing & Management. 2021, No. 4
3. Knowledge-based biomedical word sense disambiguation: An evaluation and application to clinical document classification [J]. Garla V.N., Brandt C. Journal of the American Medical Informatics Association. 2013, No. 5
4. Document Clustering Using Word Sense Disambiguation [C]. M. S. Mostafa, M. H. Haggag, W. H. Gomaa. 17th International Conference on Software Engineering and Data Engineering. 2008
5. Subjectivity word sense disambiguation: A method for sense-aware subjectivity analysis [D]. Akkaya, Cem. 2014
6. Knowledge-based biomedical word sense disambiguation: an evaluation and application to clinical document classification [O]. Vijay N Garla, Cynthia Brandt. 2013
7. Fine-Grained Word Sense Disambiguation Based on Parallel Corpora, Word Alignment, Word Clustering and Aligned Wordnets [O]. Tufis, Dan; Ion, Radu; Ide, Nancy. 2005
8. Word Domain Disambiguation via Word Sense Disambiguation [R]. Sanfilippo, A. 2006
