Linked data has been widely recognized as an important paradigm for representing data, and one of the most important aspects of supporting its use is the discovery of links between datasets. Many datasets contain a significant amount of textual information in the form of labels, descriptions and documentation about the elements of the dataset, and precise linking fundamentally depends on applying semantic textual similarity to this text. However, most linking tools so far rely only on simple string similarity metrics such as Jaccard scores. We present an evaluation of some metrics that have performed well in recent semantic textual similarity evaluations and apply these to linking existing datasets.
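As a minimal illustration of the kind of simple string metric mentioned above, a Jaccard score over word sets might be sketched as follows. The function name `jaccard`, the lowercase whitespace tokenization, and the example labels are assumptions made for illustration and are not taken from any particular linking tool.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard score between the lowercase word sets of two labels.

    Tokenization by whitespace is an illustrative assumption.
    """
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        # Convention chosen here: two empty labels count as identical.
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Labels sharing words score well; synonyms with no shared words score zero.
overlap_score = jaccard("river bank", "bank of the river")
synonym_score = jaccard("car", "automobile")
```

Because the score depends only on shared surface tokens, two labels with the same meaning but different wording receive a score of zero, which is the limitation that semantic textual similarity metrics aim to address.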