This paper proposes a new disambiguation method for Japanese text input. The method evaluates each candidate sentence by counting the Word Co-occurrence Patterns (WCP) it contains. An automatic WCP extraction method is also developed. An extraction experiment on example sentences from dictionaries confirms that WCP can be collected automatically with an accuracy of 98.7%, using syntactic analysis and heuristic rules that eliminate erroneous extractions. Using this method, about 305,000 sets of WCP were collected. A co-occurrence pattern matrix over semantic categories is built from these WCP. Using this matrix, the mean number of candidate sentences in Kana-to-Kanji translation is reduced to about 1/10 of that produced by existing morphological methods.
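The idea of ranking candidates by the number of co-occurrence patterns they contain can be illustrated with a minimal sketch. It is not the paper's implementation: the pattern set, the example words, and the names `WCP`, `wcp_score`, and `rank_candidates` are all hypothetical. It also simplifies in two ways: patterns are treated as unordered word pairs, and words are matched directly rather than through a matrix of semantic categories.

```python
from itertools import combinations

# Hypothetical pattern set: unordered word pairs assumed to co-occur.
# A real system would use patterns extracted from a corpus.
WCP = {
    frozenset(("橋", "渡る")),   # "bridge" + "cross"
    frozenset(("箸", "使う")),   # "chopsticks" + "use"
}

def wcp_score(words, patterns):
    """Count distinct word pairs in a candidate that appear in the pattern set."""
    unique = sorted(set(words))
    return sum(1 for a, b in combinations(unique, 2)
               if frozenset((a, b)) in patterns)

def rank_candidates(candidates, patterns):
    """Order candidates by descending pattern count (ties keep input order)."""
    return sorted(candidates, key=lambda c: wcp_score(c, patterns), reverse=True)

# Candidate segmentations for the kana input "はしをわたる" (illustrative only).
candidates = [
    ["箸", "を", "渡る"],
    ["橋", "を", "渡る"],
    ["端", "を", "渡る"],
]
best = rank_candidates(candidates, WCP)[0]
```

In this toy setup only the candidate containing the pair 橋 / 渡る matches a pattern, so it is ranked first. Candidates with no matching pattern could then be pruned, which is the kind of reduction in candidate count the abstract reports.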