This paper focuses on estimating the relatedness and similarity between any two Wikipedia [1] articles and describes several ways of determining that similarity. We hypothesize that the relatedness between Wikipedia articles can be estimated from properties of the articles, which may be internal or external. Internal properties are those embedded in the articles themselves, for instance the content and text of the articles. External properties are those deduced or inferred from the articles, for example the topics or categories of the individual articles, or the shortest distance between two articles when they are placed in a graph or in a category hierarchy. Other techniques relevant to comparing Wikipedia articles include cosine similarity and the Jaccard similarity measure.
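As an illustration of the two text-based measures named above, the following is a minimal sketch rather than the method used in this paper. It assumes that each article is available as a plain-text string, that whitespace tokenization is adequate, and that raw term frequencies serve as vector weights. The function names `tokenize`, `cosine_similarity`, and `jaccard_similarity` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the raw term-frequency vectors of two texts."""
    tf_a, tf_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    # Only terms present in both texts contribute to the dot product.
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an empty text is treated as unrelated to everything
    return dot / (norm_a * norm_b)


def jaccard_similarity(text_a, text_b):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    set_a, set_b = set(tokenize(text_a)), set(tokenize(text_b))
    union = set_a | set_b
    if not union:
        return 0.0  # assumption: two empty texts are given similarity 0
    return len(set_a & set_b) / len(union)
```

Both functions return a value between 0 and 1, with higher values indicating greater overlap. Cosine similarity takes term counts into account, whereas Jaccard similarity considers only whether a term is present.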