Data Integration and Access by Merging Ontologies and Databases
- Contact Markus Krötzsch
- November 1, 2013 – June 30, 2021
- funded by DFG
Modern information systems increasingly empower users to decide on the nature and organisation of data, a task that was traditionally the domain of application developers. At the same time, applications open up to broader user groups, emphasising collaboration and information exchange. These trends are exemplified by Wikidata, a new sister project of Wikipedia that will provide a central, multilingual database site where users manage all of Wikipedia’s factual information. Wikidata is the mainstream breakthrough for a new type of information system that brings the flexibility and dynamicity of Wikipedia to structured data management.
Wikidata is very different from classical database applications and faces completely new problems. Without a fixed format for storing data, information is more difficult to find, errors are harder to detect, and overlapping information grows, so the quality and utility of the data suffer. At the same time, the data and its format are highly dynamic and not formally documented. To tackle these problems, the DIAMOND project aims to extend the recent approach of Ontology-Based Data Access (OBDA) to dynamic, large-scale data management platforms. OBDA uses a conceptual model, called an ontology, which describes relationships between heterogeneous information models in a way that allows computers to align and integrate data automatically. However, OBDA so far only works well for relatively small, static, and well-designed ontologies, whereas applications like Wikidata require large, dynamic, user-created ontologies.
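To make the OBDA idea concrete, the following is a minimal sketch in Python and not the project's actual system. It assumes a toy ontology that consists only of subclass statements, and two toy data sources that record class membership at different levels of detail. All names in it (`SUBCLASS_OF`, `SOURCE_A`, `SOURCE_B`, `subclasses_of`, `instances_of`) are hypothetical and chosen for illustration. A query for one class is rewritten into a query over that class and all of its subclasses, so data from both sources is integrated without changing how either source stores it.

```python
# Toy ontology (an assumption for illustration): each class maps to its
# direct superclasses.
SUBCLASS_OF = {
    "Physicist": {"Scientist"},
    "Chemist": {"Scientist"},
    "Scientist": {"Person"},
}

# Two toy data sources that describe people using different classes.
SOURCE_A = [("Marie Curie", "Physicist"), ("Ada Lovelace", "Person")]
SOURCE_B = [("Linus Pauling", "Chemist")]


def subclasses_of(target):
    """Return the target class plus every class the ontology places below it."""
    result = {target}
    changed = True
    while changed:  # repeat until no further subclass can be added
        changed = False
        for sub, supers in SUBCLASS_OF.items():
            if sub not in result and supers & result:
                result.add(sub)
                changed = True
    return result


def instances_of(target, *sources):
    """Answer 'which entities are a target?' by rewriting the query so that
    it also matches all subclasses of the target, across all sources."""
    wanted = subclasses_of(target)
    return sorted(
        {name for source in sources for name, cls in source if cls in wanted}
    )


if __name__ == "__main__":
    print(instances_of("Scientist", SOURCE_A, SOURCE_B))
```

Neither source mentions the class `Scientist` directly, yet the query returns entries from both because the ontology relates the classes. Real OBDA systems work with far more expressive ontology languages and with mappings to database queries; this sketch only shows the basic rewriting step.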
The main goals of DIAMOND are (1) to develop the foundations of ontology languages for this new type of information system, (2) to design an integrated prototype system to show that this solution can work, and (3) to advance the methodologies for evaluating OBDA formalisms and systems. To achieve this, the project closely combines theoretical and practical research. Foundational questions from mathematical logic, database theory, and knowledge representation will be studied in close connection with practical questions of algorithm design, automated deduction, and system architecture. Accordingly, the methods of scientific investigation involve mathematical proofs as well as experimental evaluations. The motivating use case of Wikidata serves as a source of realistic requirements and as a guide to ensure the relevance of the research.
If successful, the project will push the theoretical and practical boundaries of knowledge-based systems, which might otherwise soon become major obstacles in modern information management. Wikidata provides a first major example of general trends that will affect many communities, organisations, and businesses, but the potential of this project goes far beyond this prominent use case.
Talks and Miscellaneous