SLINT: A Schema-Independent Linked Data Interlinking System
Bibliographical Metadata
Subject: Link Discovery
Keywords: linked data, schema-independent, blocking, interlinking
Year: 2012
Authors: Khai Nguyen, Ryutaro Ichise, Bac Le
Venue: OM
Content Metadata
Problem: Link Discovery
Approach: Weighted co-occurrence and adaptive filtering in blocking and instance matching
Implementation: SLINT
Evaluation: Accuracy Evaluation
Abstract
Linked data interlinking is the discovery of all instances that represent the same real-world object and are located in different data sources. Since different data publishers frequently use different schemas for storing resources, we aim to develop a schema-independent interlinking system. Our system automatically selects important predicates and useful predicate alignments, which are used as the key for blocking and instance matching. The key distinction of our system is the use of weighted co-occurrence and adaptive filtering in blocking and instance matching. Experimental results show that the system substantially improves precision and recall over some recent systems. The performance of the system and the efficiency of its main steps are also discussed.
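To make the idea of weighted co-occurrence with adaptive filtering more concrete, the following is a minimal sketch of a blocking step, not SLINT's actual implementation. It assumes that each instance is reduced to a single text value, that shared tokens contribute a per-token weight to a pair's score, and that a pair is kept when its score is close to the best score of its source instance. The names `candidate_pairs`, `token_weight`, and `keep_ratio` are hypothetical.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase a string and split it into a set of whitespace-separated tokens."""
    return set(text.lower().split())


def candidate_pairs(source, target, token_weight, keep_ratio=0.8):
    """Return candidate (source_id, target_id) pairs.

    Assumption: a pair's score is the sum of the weights of the tokens
    the two instances share (a simple weighted co-occurrence).
    Assumption: the filter is "adaptive" in the sense that the cut-off
    is relative to the best score of each source instance.
    """
    # Inverted index: token -> target instances containing it.
    index = defaultdict(set)
    for tid, text in target.items():
        for tok in tokenize(text):
            index[tok].add(tid)

    # Accumulate weighted co-occurrence for pairs sharing at least one token.
    scores = defaultdict(float)
    for sid, text in source.items():
        for tok in tokenize(text):
            for tid in index.get(tok, ()):
                scores[(sid, tid)] += token_weight.get(tok, 1.0)

    # Best score observed for each source instance.
    best = defaultdict(float)
    for (sid, _), score in scores.items():
        best[sid] = max(best[sid], score)

    # Keep only pairs whose score is near the best for their source instance.
    return sorted(
        pair for pair, score in scores.items()
        if score >= keep_ratio * best[pair[0]]
    )


# Toy data, invented for illustration only.
source = {"s1": "The Matrix", "s2": "Heat"}
target = {"t1": "Matrix The", "t2": "The Insider", "t3": "Heat"}
token_weight = {"the": 0.1, "matrix": 1.0, "heat": 1.0, "insider": 1.0}

pairs = candidate_pairs(source, target, token_weight)
print(pairs)
```

In this toy run the pair `("s1", "t2")` shares only the low-weight token "the", so it falls below the relative cut-off and is dropped, while the pairs sharing distinctive tokens are kept.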
Conclusion
In this paper, we present SLINT, an efficient schema-independent linked data interlinking system. We select important predicates by each predicate's coverage and discriminability. The predicate alignments are constructed and filtered to obtain key alignments. We implement an adaptive filtering technique to produce candidates and identities. Compared with the most recent systems, SLINT achieves considerably higher precision and recall in interlinking. SLINT is also fast: it takes around 1 minute to detect more than 13,000 identity pairs.
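Predicate selection by coverage and discriminability can be illustrated with a short sketch. The definitions used here are assumptions made for illustration: coverage is taken as the share of instances that use a predicate, and discriminability as the number of distinct objects divided by the number of triples for that predicate. The function names and the threshold values are hypothetical and are not taken from SLINT.

```python
from collections import defaultdict


def predicate_scores(triples):
    """Return {predicate: (coverage, discriminability)} for (s, p, o) triples.

    Assumed definitions (for illustration only):
      coverage         = instances using the predicate / all instances
      discriminability = distinct objects of the predicate / its triples
    """
    subjects = set()
    subjects_by_pred = defaultdict(set)
    objects_by_pred = defaultdict(set)
    triples_by_pred = defaultdict(int)

    for s, p, o in triples:
        subjects.add(s)
        subjects_by_pred[p].add(s)
        objects_by_pred[p].add(o)
        triples_by_pred[p] += 1

    scores = {}
    for p, n_triples in triples_by_pred.items():
        coverage = len(subjects_by_pred[p]) / len(subjects)
        discriminability = len(objects_by_pred[p]) / n_triples
        scores[p] = (coverage, discriminability)
    return scores


def select_important(scores, min_coverage=0.5, min_discriminability=0.5):
    """Keep predicates that pass both (hypothetical) thresholds."""
    return sorted(
        p for p, (cov, dis) in scores.items()
        if cov >= min_coverage and dis >= min_discriminability
    )


# Toy triples, invented for illustration only.
triples = [
    ("m1", "title", "Alien"),
    ("m2", "title", "Heat"),
    ("m3", "title", "Solaris"),
    ("m1", "type", "Film"),
    ("m2", "type", "Film"),
    ("m3", "type", "Film"),
    ("m1", "note", "restored"),
]

scores = predicate_scores(triples)
important = select_important(scores)
print(important)
```

Here `type` is used by every instance but always has the same value, so it does not help tell instances apart, and `note` is distinctive but appears on only one instance. Only `title` passes both thresholds.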
Future work
Although SLINT achieves good results on the tested datasets, these experiments are not sufficient to evaluate the scalability of our system, which we consider its current limiting point because of the use of a weighted co-occurrence matrix. We will investigate a solution for this issue in our next work. We are also interested in automatic configuration of every threshold used in SLINT and in extending SLINT into a novel cross-domain interlinking system.
Approach
Positive Aspects: No data available now.
Negative Aspects: No data available now.
Limitations: No data available now.
Challenges: No data available now.
Proposes Algorithm: No data available now.
Methodology: No data available now.
Requirements: No data available now.
Implementations
Download-page: http://ri-www.nii.ac.jp/SLINT/index.html
Access API: No data available now.
Information Representation: No data available now.
Data Catalogue: No data available now.
Runs on OS: No data available now.
Vendor: No data available now.
Uses Framework: No data available now.
Has Documentation URL: No data available now.
Programming Language: No data available now.
Version: No data available now.
Platform: No data available now.
Toolbox: No data available now.
GUI: No
Research Problem
Subproblem of: No data available now.
RelatedProblem: No data available now.
Motivation: No data available now.
Evaluation
Experiment Setup: 2.66 GHz quad-core CPU and 4 GB of memory
Evaluation Method: Comparison of the system with AgreementMaker, SERIMI, and Zhishi.Links
Hypothesis: No data available now.
Description: No data available now.
Dimensions: Accuracy
Benchmark used: LinkedMDB, GeoNames
Results: SLINT outperforms the other systems on both precision and recall. AgreementMaker achieves precision competitive with SLINT on dataset D3, but its recall is much lower. Zhishi.Links scores very high on dataset D3, but SLINT's overall F1 score is still 0.05 higher.
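For readers unfamiliar with the measures reported above, the sketch below shows how precision, recall, and F1 are conventionally computed for a set of discovered links against a reference alignment. The link pairs are invented for illustration and are not taken from the evaluation; `precision_recall_f1` is a hypothetical helper name.

```python
def precision_recall_f1(found, reference):
    """Compute precision, recall, and F1 for sets of (source, target) links."""
    found, reference = set(found), set(reference)
    true_positives = len(found & reference)

    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented example: 4 reference links, 3 discovered links, 2 of them correct.
reference = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
found = {("a1", "b1"), ("a2", "b2"), ("a3", "b9")}

precision, recall, f1 = precision_recall_f1(found, reference)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, a system with high precision but low recall, as described for AgreementMaker on D3, still ends up with a modest F1.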