19th International Conference on Computational Linguistics (Coling 2002), Vol. 1, Aug 26-30, 2002, Taipei, Taiwan

Using Syntactic Analysis to Increase Efficiency in Visualizing Text Collections

Abstract

Self-Organizing Maps (SOMs) are a good method for clustering and visualizing large collections of documents, but they are computationally expensive. In this paper, we investigate linguistically motivated reductions of the usual bag-of-words representation to improve efficiency. We find that reducing the document representation to the heads of verb and noun phrases lowers the heavy computational cost without degrading the quality of the map, especially in combination with term reduction techniques. More severe reductions that focus on subject and object nominal phrases are not advantageous.
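
The reduction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pre-chunked input format, the toy documents, and the names `DOCS`, `bag_of_words`, `head_bag`, and `vocabulary` are all assumptions made for illustration. The sketch compares the vocabulary of a full bag-of-words representation with one that keeps only the head of each noun and verb phrase. In a typical term-vector setup, the vocabulary size is the length of each SOM input vector, which is where the cost saving would come from.

```python
from collections import Counter

# Hypothetical pre-chunked input: each document is a list of phrases, and
# each phrase is (phrase_type, tokens, head_index). A real pipeline would
# obtain these from a parser or chunker; this format is an assumption.
DOCS = [
    [("NP", ["the", "large", "map"], 2),
     ("VP", ["is", "trained"], 1),
     ("PP", ["on"], 0),
     ("NP", ["many", "short", "documents"], 2)],
    [("NP", ["a", "parser"], 1),
     ("VP", ["reduces"], 0),
     ("NP", ["the", "long", "documents"], 2)],
]


def bag_of_words(doc):
    """Count every token of every phrase (the usual representation)."""
    return Counter(tok for _, tokens, _ in doc for tok in tokens)


def head_bag(doc, keep=("NP", "VP")):
    """Count only the head token of each noun or verb phrase."""
    return Counter(tokens[head] for kind, tokens, head in doc if kind in keep)


def vocabulary(bags):
    """Sorted list of all terms; its length is the vector dimension."""
    return sorted(set().union(*bags))


full_vocab = vocabulary([bag_of_words(d) for d in DOCS])
head_vocab = vocabulary([head_bag(d) for d in DOCS])

print("full bag-of-words dimension:", len(full_vocab))
print("head-only dimension:", len(head_vocab))
```

On this toy input the head-only vocabulary drops function words and modifiers while keeping the content-bearing nouns and verbs. Whether map quality is preserved on real collections is the empirical question the paper addresses; the sketch says nothing about it.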
