From Data Lake to Knowledge Lake: Project STELAR

16 January 2023

Prof. Eirini Ntoutsi, Institut für Datensicherheit, and Marcus Knüpfer, FI CODE, have secured funding from the European Commission for the STELAR project.

Duration: 1 September 2022 to 31 August 2025
Funding: European Commission – HEU-Digital, Industry and Space

STELAR will design, develop, evaluate, and showcase an innovative Knowledge Lake Management System (KLMS) to support and facilitate a holistic approach for FAIR (Findable, Accessible, Interoperable, Reusable) and AI-ready (high-quality, reliably labeled) data. The STELAR KLMS will make it possible to (semi-)automatically turn a raw data lake into a knowledge lake. This is achieved by (1) enhancing the data lake with a knowledge layer, and (2) developing and integrating a set of data management tools and workflows. The knowledge layer will comprise (a) a data catalog offering automatically enhanced metadata for the raw data assets in the lake, and (b) a knowledge graph that semantically describes and interlinks these data assets using suitable domain ontologies and vocabularies. The provided tools and workflows will offer novel functionalities for (a) data discovery and quality management; (b) data linking and alignment; and (c) data annotation and synthetic data generation. The KLMS will combine human-in-the-loop and automatic approaches to leverage the background knowledge of domain experts while minimizing their involvement. To reduce manual effort and time, it will increase the automation of three tasks: finding and selecting relevant data sources; configuring and tuning the involved data management tools; and designing, executing, and monitoring end-to-end data processing workflows adapted to different user needs. The KLMS will include specialized tools and functions for geospatial, temporal, and textual data. An organization, ranging from a data-intensive SME to the operator of a data marketplace, will be able to use the STELAR KLMS to make its data assets more ready for use in AI applications and for sharing and exchange within a common data space. The STELAR KLMS will be pilot tested in diverse, real-world use cases in the agrifood data space, one of the nine data spaces of strategic societal and economic importance identified in the European Strategy for Data.

Image: © gettyimages/ArtemisDiana