This work studies methods for annotating Web tables for semantic indexing and search: labeling table columns with semantic types and linking content cells to named entities. Building on a state-of-the-art method, it focuses on developing and evaluating methods that achieve these goals using only partial content sampled from the table, rather than the entire table content as typical state-of-the-art methods do. The method first annotates table columns using a sample selected automatically based on the data in the table, then uses the resulting type information to guide the disambiguation of content cells. Several sample selection methods are introduced, and experiments show that they yield higher accuracy in cell disambiguation and comparable accuracy in column type annotation, with reduced computational overhead.
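
The two-stage flow described above (type a column from a sample, then use that type to disambiguate every cell) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy knowledge base `TOY_KB`, the function names, and in particular the "least ambiguous cells first" sampling heuristic, which stands in for the sample selection methods of the work and is not claimed to be one of them.

```python
from collections import Counter

# Hypothetical toy knowledge base: cell text -> candidate (entity_id, type) pairs.
TOY_KB = {
    "Paris": [("Paris_France", "City"), ("Paris_Hilton", "Person")],
    "London": [("London_UK", "City"), ("Jack_London", "Person")],
    "Berlin": [("Berlin_Germany", "City")],
    "Sydney": [("Sydney_Australia", "City"), ("Sydney_Pollack", "Person")],
}


def select_sample(column, size):
    """Assumed heuristic: keep the `size` cells with the fewest candidates."""
    known = [cell for cell in column if cell in TOY_KB]
    # sorted() is stable, so ties keep their original column order.
    return sorted(known, key=lambda cell: len(TOY_KB[cell]))[:size]


def annotate_column(sample):
    """Vote for a column type using the candidate types of the sampled cells only."""
    votes = Counter(
        entity_type for cell in sample for _, entity_type in TOY_KB.get(cell, [])
    )
    if not votes:
        return None  # nothing in the sample could be matched
    return votes.most_common(1)[0][0]


def disambiguate(column, column_type):
    """Link every cell, preferring a candidate whose type matches the column type."""
    linked = []
    for cell in column:
        candidates = TOY_KB.get(cell, [])
        matching = [eid for eid, etype in candidates if etype == column_type]
        if matching:
            linked.append(matching[0])
        elif candidates:
            linked.append(candidates[0][0])  # fall back to the first candidate
        else:
            linked.append(None)  # no candidate entity known for this cell
    return linked


column = ["Paris", "London", "Berlin", "Sydney"]
sample = select_sample(column, size=2)
column_type = annotate_column(sample)
entities = disambiguate(column, column_type)
```

Only the sampled cells are consulted when voting for the column type, which is where the reduction in computation would come from in this sketch; the full column is visited once, in the cheaper disambiguation pass.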