
Developing a Corpus Specific Stoplist Using Quantitative Comparison


Abstract

We have become overwhelmed with electronic information, and it seems our situation is not going to improve. It is increasingly common for people to work with information on a daily basis. We spend more and more time looking for information, and the search takes longer because more information is available. This thesis looks at how we can provide faster access to the information we want to find. Today's requirements are closely related to searching for information using queries. At the heart of the query process is the removal of search terms that have little or no significance to the search being performed. Words considered to have little searching power, called stopwords, are compiled in a stoplist. Stoplists are usually constructed from commonly occurring words in the English language. This approach is acceptable for systems handling broad categories of information. We will instead build a stoplist for a specific area of interest, based on a specific body of linguistic data, or corpus. A stoplist developed from an Air Force corpus will be tested to see whether it is more effective than a stoplist created from a general-use corpus.
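To make the idea of a corpus-specific stoplist concrete, here is a minimal Python sketch. The abstract does not describe the thesis's actual quantitative comparison, so this sketch substitutes a simple assumption: a word becomes a stopword when it appears in at least a given fraction of the corpus documents. The function names (`tokenize`, `build_stoplist`, `filter_query`), the `min_doc_fraction` parameter, and the three sample sentences are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def build_stoplist(documents, min_doc_fraction=0.8):
    """Return words that occur in at least min_doc_fraction of the documents.

    Assumption: document frequency stands in for "little searching power".
    This is not necessarily the criterion used in the thesis.
    """
    doc_freq = Counter()
    for doc in documents:
        # Count each word once per document, not once per occurrence.
        doc_freq.update(set(tokenize(doc)))
    threshold = min_doc_fraction * len(documents)
    return {word for word, count in doc_freq.items() if count >= threshold}


def filter_query(query, stoplist):
    """Drop stopwords from a query, keeping the remaining terms in order."""
    return [term for term in tokenize(query) if term not in stoplist]


# Made-up sample corpus for illustration only.
corpus = [
    "The aircraft maintenance squadron inspected the aircraft engine.",
    "The aircraft wing was repaired by the maintenance crew.",
    "Pilots reported the aircraft radar fault to the squadron.",
]

stoplist = build_stoplist(corpus, min_doc_fraction=1.0)
print(sorted(stoplist))
print(filter_query("the aircraft radar fault", stoplist))
```

In this toy corpus, "aircraft" appears in every document, so it ends up on the stoplist alongside "the". A general English stoplist would contain "the" but not "aircraft", which is the kind of difference a corpus-specific stoplist is meant to capture.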
