Methods and apparatus for identifying a count of N-grams occurring in a corpus
Abstract
Methods, apparatus, systems, and articles of manufacture for identifying a count of N-grams occurring in a corpus are disclosed herein. An example method includes identifying a token that frequently begins a suffix found in the corpus. First suffixes and second suffixes are identified within the corpus, the first suffixes beginning with the token and the second suffixes not beginning with the token. A first counting algorithm is performed to identify a first count of N-grams occurring in the first suffixes. A second counting algorithm is performed to identify a second count of N-grams occurring in the second suffixes. The second counting algorithm differs from the first counting algorithm.
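
The partitioning described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which two counting algorithms are used, so the hash-based and sort-based counters below are assumptions chosen only to show two different algorithms applied to the two suffix groups. All function names are hypothetical, and taking the most common token in the corpus as the "frequent" token is likewise an assumption.

```python
from collections import Counter
from itertools import groupby


def suffixes(tokens):
    """Return every suffix of the token sequence as a tuple."""
    return [tuple(tokens[i:]) for i in range(len(tokens))]


def frequent_token(tokens):
    """Assumption: the frequent token is simply the most common token."""
    return Counter(tokens).most_common(1)[0][0]


def count_by_hashing(suffix_list, n):
    """Hypothetical first algorithm: hash-table counting of length-n prefixes."""
    counts = Counter()
    for s in suffix_list:
        if len(s) >= n:
            counts[s[:n]] += 1
    return counts


def count_by_sorting(suffix_list, n):
    """Hypothetical second algorithm: sort the prefixes, then count equal runs."""
    prefixes = sorted(s[:n] for s in suffix_list if len(s) >= n)
    counts = Counter()
    for gram, run in groupby(prefixes):
        counts[gram] = sum(1 for _ in run)
    return counts


def count_ngrams(tokens, n):
    """Split suffixes on the frequent token and count each group separately."""
    hot = frequent_token(tokens)
    all_suffixes = suffixes(tokens)
    first = [s for s in all_suffixes if s[0] == hot]    # begin with the token
    second = [s for s in all_suffixes if s[0] != hot]   # do not begin with it
    first_counts = count_by_hashing(first, n)
    second_counts = count_by_sorting(second, n)
    return hot, first_counts, second_counts


if __name__ == "__main__":
    corpus = "the cat and the dog and the bird".split()
    hot, first_counts, second_counts = count_ngrams(corpus, 2)
    print(hot)
    print(first_counts + second_counts)
```

Each occurrence of an N-gram is the length-N prefix of exactly one suffix, and every suffix falls into exactly one of the two groups, so adding the two counts gives the N-gram counts for the whole corpus.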