We improve relation extraction with the subsequence kernel by adjusting the condition used to judge whether two words are equivalent and by adding a data preprocessing step. Traditional subsequence kernel methods lose performance on less reliable sentences and on multi-entity sentences, and their experiments were run only on relatively ideal corpora in which each sentence contains exactly two entities. We apply the method to a system that visualizes the relations between entries on a wiki web site, where the content is edited by users and multi-entity sentences are common. In our method, before computing the kernel as usual, we filter the candidate entity pairs according to the entity class and the relevant relation type. Our experiments, which cover sentences containing more than two entities, demonstrate the advantage of this approach on low-quality corpora.
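
The pair-filtering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Entity` type, the `RELATION_SCHEMA` table, the class labels, and the `candidate_pairs` function are all hypothetical names introduced here, and the sketch assumes each relation type is associated with a fixed set of admissible (class, class) argument pairs.

```python
from itertools import combinations
from typing import NamedTuple


class Entity(NamedTuple):
    text: str
    cls: str  # entity class label (hypothetical label set)


# Hypothetical schema: relation type -> admissible (class, class) argument pairs.
RELATION_SCHEMA = {
    "works_for": {("PERSON", "ORGANIZATION")},
    "located_in": {("ORGANIZATION", "LOCATION"), ("PERSON", "LOCATION")},
}


def candidate_pairs(entities, relation):
    """Keep only the entity pairs whose classes fit the given relation type.

    Pairs are returned in the argument order required by the schema, so a
    sentence with many entities yields only the pairs worth passing on to
    the kernel computation.
    """
    allowed = RELATION_SCHEMA[relation]
    pairs = []
    for a, b in combinations(entities, 2):
        if (a.cls, b.cls) in allowed:
            pairs.append((a, b))
        elif (b.cls, a.cls) in allowed:
            pairs.append((b, a))
    return pairs


if __name__ == "__main__":
    sentence_entities = [
        Entity("Alice", "PERSON"),
        Entity("Acme", "ORGANIZATION"),
        Entity("Paris", "LOCATION"),
    ]
    for first, second in candidate_pairs(sentence_entities, "works_for"):
        print(first.text, second.text)
```

In a sentence with three entities there are three unordered pairs; under this assumed schema only one of them survives for `works_for`, so the kernel would be computed for that pair alone.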