METHODS AND APPARATUS TO IDENTIFY A COUNT OF N-GRAMS APPEARING IN A CORPUS
Abstract
Methods, apparatus, systems, and articles of manufacture to identify a count of n-grams appearing in a corpus are disclosed herein. An example method includes identifying a token that frequently begins a suffix found in the corpus. First suffixes and second suffixes are identified within the corpus; the first suffixes begin with the token and the second suffixes do not. A first counting algorithm is performed to identify a first count of n-grams appearing in the first suffixes. A second counting algorithm is performed to identify a second count of n-grams appearing in the second suffixes. The second counting algorithm is different from the first counting algorithm.
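The partition-then-count flow described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which two counting algorithms are used or which one is applied to which partition, so the choices here are assumptions made only for illustration: a sort-and-group pass for the suffixes that begin with the frequent token, and a hash-map pass for the rest. All function names (`suffixes`, `count_by_sorting`, `count_by_hashing`, `count_ngrams`) and the `max_n` parameter are hypothetical, and "frequent" is simplified to "the single most common first token".

```python
from collections import Counter
from itertools import groupby


def suffixes(tokens, max_n):
    """Yield each suffix of the token list, truncated to at most max_n tokens."""
    for i in range(len(tokens)):
        yield tuple(tokens[i:i + max_n])


def count_by_sorting(suffix_list, max_n):
    """Assumed first algorithm: sort the suffixes, then count runs of equal prefixes."""
    counts = Counter()
    ordered = sorted(suffix_list)
    for k in range(1, max_n + 1):
        # In lexicographic order, suffixes sharing a length-k prefix are adjacent.
        prefixes = (s[:k] for s in ordered if len(s) >= k)
        for ngram, run in groupby(prefixes):
            counts[ngram] = sum(1 for _ in run)
    return counts


def count_by_hashing(suffix_list):
    """Assumed second algorithm: increment a hash-map entry for every prefix."""
    counts = Counter()
    for s in suffix_list:
        for k in range(1, len(s) + 1):
            counts[s[:k]] += 1
    return counts


def count_ngrams(tokens, max_n):
    """Split suffixes on the most frequent first token and count each group separately."""
    all_suffixes = list(suffixes(tokens, max_n))
    if not all_suffixes:
        return Counter()
    # Simplification: treat the single most common leading token as "frequent".
    frequent_token, _ = Counter(s[0] for s in all_suffixes).most_common(1)[0]
    first = [s for s in all_suffixes if s[0] == frequent_token]
    second = [s for s in all_suffixes if s[0] != frequent_token]
    # The two partitions yield disjoint n-gram keys, so the counts simply merge.
    return count_by_sorting(first, max_n) + count_by_hashing(second)


corpus = "the cat sat on the mat the end".split()
ngram_counts = count_ngrams(corpus, max_n=2)
print(ngram_counts[("the",)])  # prints 3
```

Every n-gram occurrence is a prefix of exactly one suffix, so counting suffix prefixes counts n-grams. Because an n-gram starts with the same token as the suffix that contains it, the two partitions never contribute to the same n-gram, which is what lets them be counted independently with different algorithms.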