Links build the backbone of the Linked Data Cloud. With the steady growth in the size of datasets comes an increased need for end users to know which frameworks to use for deriving links between datasets. In this survey, we comparatively evaluate current Link Discovery tools and frameworks. For this purpose, we outline general requirements and derive a generic architecture of Link Discovery frameworks. Based on this generic architecture, we study and compare the features of state-of-the-art linking frameworks. We also analyze reported performance evaluations for the different frameworks. Finally, we derive insights pertaining to possible future developments in the domain of Link Discovery.
We investigated ten LD frameworks and compared their functionality based on a common set of criteria. The criteria cover the main steps such as the configuration of linking specifications and methods for matching and runtime optimization. We also covered general aspects such as the supported input formats and link types, support for a GUI, and software availability as open source. We observed that the considered tools already provide rich functionality with support for semi-automatic configuration, including advanced learning-based approaches such as unsupervised genetic programming and active learning. On the other hand, we found that most tools still focus on simple property-based match techniques rather than using the ontological context within structural matchers. Furthermore, existing links and background knowledge are not yet exploited in the considered frameworks. More comprehensive support of efficiency techniques is also necessary, such as the combined use of blocking, filtering, and parallel processing.

We also analyzed comparative evaluations of the LD frameworks to assess their relative effectiveness and efficiency. In this respect, the OAEI instance matching track is the most relevant effort, and we thus analyzed its match tasks as well as the tool participation and results in recent years. Unfortunately, participation has been rather low, preventing a comparative evaluation of most of the tools. Moreover, the focus of the contest has so far been on effectiveness, while runtime efficiency has not yet been evaluated. To better assess the relative effectiveness and efficiency of LD tools, it would be valuable to test them on a common set of benchmark tasks on the same hardware. Given the general availability of the tools and the existence of a considerable set of match task definitions and datasets, this should be feasible with reasonable effort.
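To make the discussed technique stack concrete, the following is a minimal, hypothetical sketch of property-based matching combined with blocking and filtering, the pattern most surveyed frameworks implement. The entity URIs, the `label` property, the Jaccard token similarity, and the threshold value are all illustrative assumptions, not taken from any particular tool.

```python
# Hypothetical sketch: property-based link discovery with blocking
# (token-based inverted index) and filtering (similarity threshold).
# Entities, property names, and the 0.8 threshold are illustrative.

from collections import defaultdict


def tokens(value):
    """Tokenize a property value into a lowercase token set."""
    return set(value.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def discover_links(source, target, threshold=0.8):
    """Link entities whose 'label' properties are sufficiently similar.

    Blocking: only entity pairs sharing at least one label token
    are compared, avoiding the full Cartesian product.
    Filtering: candidate pairs below the threshold are discarded.
    """
    # Build an inverted index (the blocks) over target label tokens.
    blocks = defaultdict(set)
    for uri, props in target.items():
        for tok in tokens(props["label"]):
            blocks[tok].add(uri)

    links = []
    for s_uri, s_props in source.items():
        s_toks = tokens(s_props["label"])
        # Blocking step: candidates share at least one token.
        candidates = set().union(*(blocks[t] for t in s_toks)) if s_toks else set()
        for t_uri in candidates:
            sim = jaccard(s_toks, tokens(target[t_uri]["label"]))
            if sim >= threshold:  # filtering step
                links.append((s_uri, "owl:sameAs", t_uri, sim))
    return links


source = {"ex:a1": {"label": "Berlin Germany"}}
target = {"ex:b1": {"label": "berlin germany"},
          "ex:b2": {"label": "Paris France"}}
print(discover_links(source, target))
# → [('ex:a1', 'owl:sameAs', 'ex:b1', 1.0)]
```

Real frameworks generalize this pattern with multi-property linking specifications, richer similarity measures, and more sophisticated blocking schemes, but the two-phase structure of candidate reduction followed by threshold filtering is the same.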