The present invention is directed towards a method and system for characterizing web content by capturing the semantics of folksonomies relating to content entities of user generated content. The method and system include determining a plurality of tags that describe a plurality of content entities and determining a co-occurrence of the tags. The method and system further include generating weighted vectors based on the determined co-occurrence of the tags and characterizing a content entity based on the weighted vectors. The resulting characterization of the content entity may be used for any number of suitable purposes, including, by way of example, improving search results and the relevancy of associated advertising.
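By way of illustration only, the tag co-occurrence and weighted-vector steps described above might be sketched as follows. The abstract does not specify a weighting scheme, so this sketch assumes one: each tag's vector holds its co-occurrence counts normalized to sum to 1, and a content entity is characterized by summing the vectors of its own tags. The function names (`cooccurrence`, `weighted_vector`, `characterize`) and the sample data are hypothetical and are not taken from the disclosure.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence(entity_tags):
    """Count how often each pair of tags describes the same content entity."""
    counts = defaultdict(Counter)
    for tags in entity_tags.values():
        # Deduplicate so a tag repeated on one entity is counted once.
        for a, b in combinations(sorted(set(tags)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def weighted_vector(tag, counts):
    """Assumed weighting: co-occurrence counts normalized to sum to 1."""
    row = counts.get(tag, {})
    total = sum(row.values())
    return {other: n / total for other, n in row.items()} if total else {}


def characterize(tags, counts):
    """Assumed characterization: sum the weighted vectors of an entity's tags."""
    profile = Counter()
    for tag in set(tags):
        profile[tag] += 1.0  # the tag itself contributes full weight
        for other, weight in weighted_vector(tag, counts).items():
            profile[other] += weight
    return dict(profile)


# Hypothetical folksonomy: content entity id -> user-assigned tags.
entity_tags = {
    "photo1": ["beach", "sunset", "ocean"],
    "photo2": ["beach", "ocean"],
    "photo3": ["sunset", "sky"],
}
counts = cooccurrence(entity_tags)
profile = characterize(["sky"], counts)
```

In this sketch, an entity tagged only "sky" also receives weight on "sunset", because the two tags co-occur elsewhere in the collection. This is one plausible way a characterization could surface related terms for search or advertising matching.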