This paper presents a method for entity disambiguation in information extraction from heterogeneous sources on the web. Once entities and the relations between them have been extracted, one must determine which of them refer to the same real-world entity. We model this task as a graph partitioning problem, which combines the available information more accurately than a pairwise classifier does. Moreover, our method handles uncertain information, which turns out to be quite helpful. Two algorithms are trained and compared, one probabilistic and the other deterministic. Both are tuned with genetic algorithms to find the best weights for the set of constraints. Experiments show that graph-based modeling yields better results when uncertain information is used.
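To make the graph partitioning formulation concrete, the following is a minimal sketch and not the algorithm evaluated in the paper. It assumes mentions are nodes, each pair of mentions receives a weighted sum of soft constraints, and a deterministic greedy procedure merges clusters while the summed score between them is positive. The constraint names (`same_name`, `same_org`, `diff_org`), the weight values, and the sample mentions are hypothetical; in the paper the weights are found by a genetic algorithm, which is not shown here.

```python
from itertools import combinations


def pair_score(a, b, weights):
    """Weighted sum of soft constraints; a positive score favors 'same entity'."""
    score = 0.0
    # Hypothetical constraints for illustration, not taken from the paper.
    if a["name"].lower() == b["name"].lower():
        score += weights["same_name"]
    if a["org"] and b["org"]:
        if a["org"] == b["org"]:
            score += weights["same_org"]
        else:
            score -= weights["diff_org"]
    return score


def partition(mentions, weights):
    """Greedily merge the two clusters with the highest positive summed score."""
    clusters = [[i] for i in range(len(mentions))]
    while True:
        best, best_pair = 0.0, None
        for x, y in combinations(range(len(clusters)), 2):
            s = sum(pair_score(mentions[i], mentions[j], weights)
                    for i in clusters[x] for j in clusters[y])
            if s > best:
                best, best_pair = s, (x, y)
        if best_pair is None:  # no merge improves the partition
            return clusters
        x, y = best_pair  # x < y, so deleting y leaves index x valid
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]


mentions = [
    {"name": "J. Smith", "org": "Acme"},
    {"name": "j. smith", "org": "Acme"},
    {"name": "J. Smith", "org": "Globex"},
    {"name": "J. Smith", "org": None},  # incomplete evidence
]
weights = {"same_name": 1.0, "same_org": 2.0, "diff_org": 3.0}
clusters = partition(mentions, weights)
print(clusters)
```

With these assumed weights, mention 3 scores positively against every other mention, so a pairwise classifier thresholded at zero would link it to both the Acme and the Globex mentions, even though those two conflict with each other. Scoring whole clusters resolves the conflict: the sketch groups mentions 0, 1, and 3 together and leaves mention 2 on its own.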