Ontologies play an important role in describing and sharing the knowledge of a domain. This paper proposes a method to automatically generate domain ontologies from Chinese encyclopedias on the web. First, we take the terms that appear in the category systems of the encyclopedias as concepts, resolve synonyms, and derive an original taxonomy from Chinese Wikipedia and Hudong-Baike. For concepts not covered by the original taxonomy, we use a set-theory-like method to form a directed graph, generate a tree from the graph via a maximum-spanning-tree algorithm, and merge the tree into the taxonomy. Next, we take the titles of normal articles as instances and assign them to concepts using the category labels the articles carry. The attributes of concepts and instances are generated from special structures such as InfoBox modules. We successfully learn a plant ontology, and subsequent experiments show that the learned ontology has good precision and high coverage.
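The graph-to-tree step for concepts outside the original taxonomy can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the procedure used in the paper: the abstract does not specify the set-theoretic measure, so the sketch assumes an edge from a candidate parent to a child whenever most of the child's instances also belong to the parent, weighted by Jaccard similarity. The names `subset_edges`, `greedy_tree`, `min_inclusion`, and the toy data are hypothetical. The tree is grown with a Prim-style greedy rule from a chosen root, which is a simplification; an exact maximum spanning tree on a directed graph would call for the Chu-Liu/Edmonds algorithm.

```python
from typing import Dict, Set, Tuple


def subset_edges(
    members: Dict[str, Set[str]], min_inclusion: float = 0.8
) -> Dict[Tuple[str, str], float]:
    """Build directed edges (parent, child) between concepts.

    Assumption: an edge exists when at least `min_inclusion` of the child's
    instances also belong to the parent; its weight is the Jaccard similarity
    of the two instance sets, so closer (more specific) parents score higher.
    """
    edges: Dict[Tuple[str, str], float] = {}
    for child, child_set in members.items():
        if not child_set:
            continue  # a concept without instances gives no evidence
        for parent, parent_set in members.items():
            if parent == child:
                continue
            shared = len(child_set & parent_set)
            if shared / len(child_set) >= min_inclusion:
                edges[(parent, child)] = shared / len(child_set | parent_set)
    return edges


def greedy_tree(edges: Dict[Tuple[str, str], float], root: str) -> Dict[str, str]:
    """Grow a tree from `root`, always adding the heaviest edge that leads
    from a concept already in the tree to one not yet in it.

    Returns a child -> parent mapping. Concepts unreachable from the root
    are left out.
    """
    in_tree = {root}
    parent_of: Dict[str, str] = {}
    while True:
        best = None
        for (parent, child), weight in edges.items():
            if parent in in_tree and child not in in_tree:
                if best is None or weight > best[2]:
                    best = (parent, child, weight)
        if best is None:
            return parent_of
        parent_of[best[1]] = best[0]
        in_tree.add(best[1])


# Toy data (hypothetical): each concept maps to the instances labelled with it.
members = {
    "plant": {"pine", "fir", "oak", "rose", "lily", "moss"},
    "tree": {"pine", "fir", "oak"},
    "conifer": {"pine", "fir"},
    "flower": {"rose", "lily"},
}

tree = greedy_tree(subset_edges(members), root="plant")
print(tree)  # {'tree': 'plant', 'conifer': 'tree', 'flower': 'plant'}
```

In this toy case "conifer" attaches to "tree" rather than directly to "plant" because the Jaccard weight favours the parent whose instance set is closest in size to the child's. The resulting child-to-parent mapping is the kind of tree that would then be merged into the existing taxonomy.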