Methods and systems for constructing a taxonomy based on hierarchical clustering are provided. The taxonomy is generated by first constructing a hierarchy of clusters using a clustering algorithm. A first level of the hierarchy is generated by providing a plurality of content files to the clustering algorithm. Each subsequent level is generated by providing the clusters of the preceding level to the clustering algorithm. Each cluster within the hierarchy is assigned a label that characterizes it. The labels and clusters are combined to form the taxonomy.
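
The level-by-level construction described above can be illustrated with a minimal sketch. The abstract does not name a clustering algorithm, a feature representation, or a labeling method, so everything specific here is an assumption made for illustration: content files are stood in for by plain strings, features are bag-of-words counts, the clustering step is a simple greedy "leader" clustering on cosine similarity, and a label is the most frequent terms of a cluster. The names `vectorize`, `leader_cluster`, `make_label`, and `build_taxonomy` are hypothetical.

```python
from collections import Counter
from math import sqrt


def vectorize(text):
    """Bag-of-words term counts for one content file (assumed: a plain string)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def leader_cluster(vectors, threshold):
    """Assumed clustering step: each vector joins the first cluster whose
    summed vector is at least `threshold` similar, else starts a new cluster.
    Returns a list of (summed_vector, member_indices) pairs."""
    clusters = []
    for i, vec in enumerate(vectors):
        for total, members in clusters:
            if cosine(total, vec) >= threshold:
                total.update(vec)      # fold the new member into the cluster vector
                members.append(i)
                break
        else:
            clusters.append((Counter(vec), [i]))
    return clusters


def make_label(vec, n=2):
    """Assumed labeling rule: the n most frequent terms, ties broken alphabetically."""
    ranked = sorted(vec.items(), key=lambda kv: (-kv[1], kv[0]))
    return ", ".join(term for term, _ in ranked[:n])


def build_taxonomy(docs, thresholds):
    """Build one hierarchy level per threshold and attach labels to clusters."""
    # Leaves: one node per content file, labeled with the file's own text.
    nodes = [{"label": d, "vector": vectorize(d), "children": []} for d in docs]
    for threshold in thresholds:
        # The first pass clusters the content files; every later pass clusters
        # the clusters produced by the preceding level.
        groups = leader_cluster([n["vector"] for n in nodes], threshold)
        nodes = [
            {"label": make_label(total), "vector": total,
             "children": [nodes[i] for i in members]}
            for total, members in groups
        ]
    return nodes  # top-level taxonomy nodes


docs = ["cat dog pet", "dog pet food", "stock market trade", "market trade price"]
taxonomy = build_taxonomy(docs, thresholds=[0.5, 0.0])
```

Each entry in `thresholds` produces one level of the hierarchy. Lowering the threshold at higher levels is one possible way to make upper clusters broader; a production system would more likely substitute an established clustering method and a more discriminative labeling scheme.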