We describe IFSM (Inherited Feature Similarity Measure), a model for calculating the similarity between objects (words or concepts) based on their common and distinctive features. We propose an implementation method that obtains features from abstracted triples extracted from a large text corpus using taxonomical knowledge. The model integrates two traditional approaches, i.e., relation-based and distribution-based similarity measures. An experiment over 80,000 surface triples, using our new concept abstraction method, which we call the flat probability grouping method, shows that an abstraction level of 3000 is a good basis for feature description.
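The abstract does not give the IFSM formula, so the following is only a minimal sketch of the general idea of scoring similarity from common and distinctive features that are inherited through a taxonomy. The toy taxonomy, the feature sets, the function names, and the use of a Jaccard ratio are all assumptions made for illustration; they are not the measure defined in the paper.

```python
from typing import Dict, Optional, Set

# Hypothetical toy taxonomy: child -> parent (None marks the root).
PARENT: Dict[str, Optional[str]] = {
    "animal": None,
    "bird": "animal",
    "dog": "animal",
    "sparrow": "bird",
}

# Hypothetical features attached directly to each concept
# (in the paper these would come from abstracted corpus triples).
OWN_FEATURES: Dict[str, Set[str]] = {
    "animal": {"eat", "move"},
    "bird": {"fly", "sing"},
    "dog": {"bark"},
    "sparrow": {"chirp"},
}


def inherited_features(concept: str) -> Set[str]:
    """Collect a concept's own features plus those of all its ancestors."""
    features: Set[str] = set()
    node: Optional[str] = concept
    while node is not None:
        features |= OWN_FEATURES.get(node, set())
        node = PARENT.get(node)
    return features


def feature_similarity(a: str, b: str) -> float:
    """Jaccard ratio: common features over common plus distinctive features.

    This is a stand-in scoring rule (an assumption), not the IFSM formula.
    """
    fa, fb = inherited_features(a), inherited_features(b)
    union = fa | fb
    if not union:
        return 0.0
    return len(fa & fb) / len(union)


if __name__ == "__main__":
    print(feature_similarity("sparrow", "bird"))
    print(feature_similarity("sparrow", "dog"))
```

In this sketch, concepts that share an ancestor share that ancestor's features, so a relation-based signal (taxonomy position) and a distribution-based signal (observed features) both contribute to the score.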