New Configurations for the Correction of the RDF Knowledge Bases

Document Type : Research Article

Authors

1 Department of Computer Engineering, Roudsar and Amlash Branch, Islamic Azad University, Roudsar, Iran.

2 Department of Computer Engineering, Alzahra University, Vanak, Tehran, Iran.

3 Department of Electrical Engineering, Amirkabir University of Technology, Hafez Ave., 15875-4413, Tehran, Iran.

Abstract

Every RDF knowledge base contains errors that must be corrected. Correction methods fall into three classes, which target outliers, inconsistencies, and erroneous relations, respectively. Outliers in an RDF knowledge base are of two types: outlier entities and outlier triples. Inconsistent triples are handled by inconsistency correction methods, and there are many erroneous-relation correction methods, each designed for a specific objective. The variety of these errors is so wide that no single correction method can cover them all. Most correction methods focus on only some of these errors, so a comprehensive study is needed to cover all of them for different objectives. A couple of survey articles on RDF knowledge base correction exist, but they are outdated and do not present different configurations for correcting these errors for various objectives. Since no such configuration exists in this field, a new general configuration for RDF knowledge base correction is proposed here that covers these various errors for different objectives. In this configuration, a new classification of the errors is presented in which they are divided into three classes, and the correction of each class is performed in a separate step. Finally, the state-of-the-art approach for each step is identified for each objective, and a different configuration of these methods is proposed for each objective.
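
The idea of a configuration, in which one correction step per error class is chosen for a given objective, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the toy detection rules, the order of the steps, and the objective name "default" are hypothetical and are not the methods identified in the article.

```python
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]                  # (subject, relation, object)
Step = Callable[[List[Triple]], List[Triple]]  # one correction step


def drop_outlier_triples(triples: List[Triple]) -> List[Triple]:
    """Toy outlier step (assumption): drop triples whose relation occurs only once."""
    counts: Dict[str, int] = {}
    for _, rel, _ in triples:
        counts[rel] = counts.get(rel, 0) + 1
    return [t for t in triples if counts[t[1]] > 1]


def drop_inconsistent_triples(triples: List[Triple]) -> List[Triple]:
    """Toy inconsistency step (assumption): treat 'bornIn' as functional
    and keep only the first value seen for each subject."""
    seen = set()
    kept: List[Triple] = []
    for s, r, o in triples:
        if r == "bornIn":
            if s in seen:
                continue
            seen.add(s)
        kept.append((s, r, o))
    return kept


def fix_erroneous_relations(triples: List[Triple]) -> List[Triple]:
    """Toy relation step (assumption): rename relations via a lookup table."""
    fixes = {"birthPlace": "bornIn"}
    return [(s, fixes.get(r, r), o) for s, r, o in triples]


# A configuration maps an objective to the ordered steps used for it.
# The objective name and the step order are hypothetical.
CONFIGURATIONS: Dict[str, List[Step]] = {
    "default": [
        drop_outlier_triples,
        drop_inconsistent_triples,
        fix_erroneous_relations,
    ],
}


def run_configuration(objective: str, triples: List[Triple]) -> List[Triple]:
    """Apply each step of the chosen configuration in order."""
    for step in CONFIGURATIONS[objective]:
        triples = step(triples)
    return triples


example = [
    ("Alice", "bornIn", "Paris"),
    ("Alice", "bornIn", "Rome"),
    ("Bob", "bornIn", "Oslo"),
    ("Bob", "likes", "Tea"),
]
corrected = run_configuration("default", example)
```

Swapping a different function into one slot of the list changes the configuration for an objective without touching the other steps, which is the property the separate-step design is meant to provide.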
