Journal: 電子情報通信学会技術研究報告. オフィスシステム (IEICE Technical Report, Office Systems)

The document categorization by the vector space model using lexical different area Co-occurrence

Abstract

High-quality document classification is required even when document sets exhibit field polysemy. We define a lexical different area co-occurrence as a pair of nouns that appear in the same sentence of a document and whose Euclidean distance is at or above a threshold of 0.15 to 0.35, and we propose a document categorization method based on these pairs. Using a newspaper, an industry magazine, and papers as document sets, we show that (1) accuracy of about 90% was obtained on the newspaper set, and (2) lexical different area co-occurrences increased the distance between fields by 15% compared with single nouns.
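The pair-selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how nouns are represented as vectors, so the per-noun vectors below (relative frequency of each noun across three fields), the function name `different_area_pairs`, and the threshold of 0.25 (one value inside the stated 0.15 to 0.35 range) are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def different_area_pairs(sentence_nouns, noun_vectors, threshold=0.25):
    """Return noun pairs from one sentence whose Euclidean distance
    is at or above the threshold (hypothetical helper)."""
    # Keep only nouns we have a vector for; sort for a stable pair order.
    nouns = sorted({n for n in sentence_nouns if n in noun_vectors})
    pairs = []
    for a, b in combinations(nouns, 2):
        if dist(noun_vectors[a], noun_vectors[b]) >= threshold:
            pairs.append((a, b))
    return pairs


# Assumed representation: relative frequency of each noun in three fields.
noun_vectors = {
    "bank": (0.5, 0.3, 0.2),
    "loan": (0.9, 0.05, 0.05),
    "rate": (0.6, 0.3, 0.1),
}

# Nouns extracted from a single sentence (extraction itself is out of scope).
pairs = different_area_pairs(["bank", "loan", "rate"], noun_vectors)
print(pairs)
```

In this sketch the selected pairs, rather than single nouns, would serve as the terms of the vector space model; how the original work weights them is not stated in the abstract.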