Word extraction method and system for use in word-breaking using statistical information
Abstract
A method, computer-readable medium and system are provided for collecting new words to add to a lexicon for an agglutinative language. Sentences in the agglutinative language are retrieved from documents, for example from web pages. New word candidate character strings are identified in the retrieved sentences. The identified candidate strings are filtered using a combination of a plurality of statistical criteria to generate a new words list. Words from the new words list are added to the lexicon.
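The pipeline described in the abstract (retrieve sentences, identify candidate strings, filter by combined statistics, add to the lexicon) can be sketched roughly as follows. The abstract does not name the statistical criteria, so the two used here, raw frequency and a pointwise-mutual-information cohesion score, are assumptions chosen for illustration only. The function name `extract_new_words` and its parameters and thresholds are likewise hypothetical.

```python
import math
from collections import Counter


def extract_new_words(sentences, lexicon, max_len=4, min_freq=3, min_cohesion=1.0):
    """Return candidate strings that pass both a frequency and a cohesion filter.

    Both criteria are illustrative assumptions; the source only says that
    "a plurality of statistical criteria" are combined.
    """
    # Step 1: treat every character n-gram (length 1..max_len) as a candidate.
    counts = Counter()
    for sentence in sentences:
        for n in range(1, max_len + 1):
            for i in range(len(sentence) - n + 1):
                counts[sentence[i:i + n]] += 1

    # Simplification: normalise every n-gram count by the total character count.
    total_chars = sum(c for gram, c in counts.items() if len(gram) == 1)

    def prob(gram):
        return counts[gram] / total_chars

    new_words = []
    for cand, freq in counts.items():
        # Criterion 1 (assumed): skip single characters, known words, rare strings.
        if len(cand) < 2 or cand in lexicon or freq < min_freq:
            continue
        # Criterion 2 (assumed): weakest pointwise mutual information
        # over all two-way splits of the candidate.
        cohesion = min(
            math.log(prob(cand) / (prob(cand[:k]) * prob(cand[k:])))
            for k in range(1, len(cand))
        )
        if cohesion >= min_cohesion:
            new_words.append(cand)
    return sorted(new_words)
```

A caller would then merge the result into the lexicon, for example `lexicon |= set(extract_new_words(sentences, lexicon))`. Requiring both filters to pass is one simple way to combine criteria; a weighted score would be another.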