Journal: International Journal of Advanced Computer Research

Construction of a generic stopwords list for Hindi language without corpus statistics

Abstract

Most research in information retrieval (IR) has focused on English, but considerable effort has recently gone into developing IR systems for other languages. IR research and experimentation for Hindi is relatively new and limited compared with the work done for English, which has long dominated the field. A fundamental tool in IR is the stop word list: stop words carry no retrieval value. To date, many stop word lists have been developed for English, European languages, and Chinese. However, no standard stop word list has been constructed for Hindi. This paper presents an approach to constructing a generic stop word list for the Hindi language. Our list contains more than 800 stop words.
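
To make the role of a stop word list concrete, the following is a minimal sketch of how such a list is typically applied before indexing or querying. It does not reproduce the paper's method or its list: the set `HINDI_STOP_WORDS` holds only a handful of common Hindi function words picked for illustration, and the function name `remove_stop_words` and the whitespace tokenization are assumptions.

```python
# Illustrative only: a few common Hindi function words chosen for this sketch.
# They are NOT taken from the paper's list of 800+ stop words.
HINDI_STOP_WORDS = {"का", "की", "के", "है", "और", "में", "से", "को"}


def remove_stop_words(text, stop_words=HINDI_STOP_WORDS):
    """Split text on whitespace and drop tokens found in the stop word set."""
    return [token for token in text.split() if token not in stop_words]


# "India's capital is Delhi": the function words are dropped,
# leaving only the content-bearing terms.
query = "भारत की राजधानी दिल्ली है"
print(remove_stop_words(query))  # ['भारत', 'राजधानी', 'दिल्ली']
```

A real pipeline would probably also normalize Unicode and strip punctuation before the lookup. Both steps are omitted here to keep the sketch short.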