DFRWS EU 2026 - Two Papers Accepted

16 December 2025

Our papers "Addressing the Dataset Gap Problem with Generative AI: Towards LLM-driven Forensic Scenarios for Dataset Generation" and "Boon or Bane: Source Camera Identification meets AI-generated Images" have been accepted at DFRWS EU 2026. They will be presented in Linköping, Sweden, in March 2026.

 

Addressing the Dataset Gap Problem with Generative AI: Towards LLM-driven Forensic Scenarios for Dataset Generation

Authors: Michael Plankl, Thomas Göbel and Harald Baier

Abstract:

The increasing amount of incriminating data to be analysed on the one hand and the limited availability of forensic datasets on the other hand complicate forensic research as well as the development and validation of forensic tools. This challenge is often referred to as the dataset gap problem. A novel and promising approach to solve the dataset gap problem is the generation of synthetic, forensic scenarios through the application of Generative AI (GenAI) approaches like Large Language Models (LLMs). In this paper, we demonstrate how to use popular, general-purpose foundation models to generate various forensic artefacts. While emphasising the benefits of an LLM-driven dataset generation, we also address in detail inherent risks that can impair data synthesis using LLMs (e.g., hallucinations, limited explainability, stochastic model behaviour) and show how to compensate for these limitations (e.g., skilful use of prompt engineering and architectural patterns such as function calling and AI agents). In addition, we prove the practicability of our approach by enhancing a recent data synthesis framework with LLM capabilities and a user-friendly interface. Consequently, we are able to use GenAI to automatically generate configuration files for various forensically coherent scenarios and the resulting datasets. Our implementation thus demonstrates the potential of an automated, prompt-driven scenario generation process, thereby presenting a scalable solution to the shortage of forensic dataset availability.

 

Boon or Bane: Source Camera Identification meets AI-generated Images

Authors: Samantha Klier and Harald Baier

Abstract:

Linking an image to its origin is a fundamental task in digital forensics often addressed through Source Camera Identification (SCI) based on Sensor Pattern Noise (SPN). However, recent advances in AI-enhanced smartphone photography challenge the reliability of SPN. On the other hand, noise-based identification approaches have been successfully transferred to AI-generated images. Therefore, we investigate whether the noise patterns of AI-generated images interfere with those of modern smartphones and analyze the implications for standard procedures. Our empirical evaluation reveals that the noise in AI-generated images is not predominantly additive, contradicting prior assumptions. Furthermore, we show that fingerprints of AI image generators can identify corresponding images only when the prompted resolution matches. Additionally, the standard PCE threshold leads to high false-positive rates — 61% for Adobe Firefly Image 4 and 100% for ChatGPT 5 — when comparing AI images to smartphone fingerprints. We demonstrate that simple center-cropping effectively eliminates these false positives without reducing true-positive identification performance. Our findings highlight the need for updated forensic methodologies due to the influence of software on imaging pipelines.