Article published in journal

26 February 2026

Profile images from social networks are a valuable source of data for AI analytics, but they contain biometric identifiers that pose serious privacy risks. Current face anonymization techniques often destroy semantic information, and generative de-identification methods are vulnerable to re-identification attacks.

In their article "Template-Driven Multimodal Face Pseudonymization for Privacy-Preserving Big Data Analytics", Yeong Su Lee, Hendrik Bothe and Michaela Geierhos propose a template-driven, multimodal face pseudonymization framework that enables privacy-preserving analysis of facial image data while retaining analytically relevant attributes. The approach uses a FaceNet-based CelebA attribute classifier to extract fine-grained facial attributes and a DeepFace model to extract high-level demographic attributes. Rather than relying on stochastic large language models, the authors introduce a deterministic, template-based attribute-to-text conversion that ensures consistency and reproducibility and prevents unintended attribute hallucination. The resulting textual description serves as the sole conditioning input for Janus-Pro, a multimodal text-to-image generation model that synthesizes realistic yet non-identifiable face images. The method is evaluated on the CelebA dataset under a strong adversarial threat model, using state-of-the-art face recognition systems to assess re-identification and linkability attacks. The results demonstrate a substantial reduction in identity leakage while semantic attributes are preserved.
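To illustrate what a deterministic, template-based attribute-to-text conversion might look like, here is a minimal sketch in Python. It is not the authors' implementation: the function name `describe_face`, the phrase table, the sentence template and the input format are all assumptions made for illustration. Only the attribute names (such as `Eyeglasses` or `Smiling`) follow the CelebA naming scheme.

```python
# Hypothetical phrase table mapping CelebA-style attribute names to fixed phrases.
# The selection and wording are illustrative assumptions, not taken from the article.
PHRASES = {
    "Black_Hair": "black hair",
    "Blond_Hair": "blond hair",
    "Eyeglasses": "eyeglasses",
    "Smiling": "a smile",
    "Wearing_Hat": "a hat",
}


def describe_face(age_group: str, gender: str, attributes: dict[str, bool]) -> str:
    """Render a fixed-template description from already extracted attributes."""
    # Keep only attributes that are present and known to the phrase table,
    # and sort them so the output does not depend on dictionary order.
    present = sorted(
        name for name, is_on in attributes.items() if is_on and name in PHRASES
    )
    subject = f"A portrait photo of a {age_group} {gender}"
    if not present:
        return subject + "."
    phrases = [PHRASES[name] for name in present]
    if len(phrases) == 1:
        joined = phrases[0]
    else:
        joined = ", ".join(phrases[:-1]) + " and " + phrases[-1]
    return f"{subject} with {joined}."


if __name__ == "__main__":
    print(describe_face("young", "woman", {"Smiling": True, "Eyeglasses": True}))
```

Because the text is assembled from a fixed lookup table and a fixed sentence pattern, the same attribute set always yields the same description, and no attribute can appear in the text unless the classifier reported it.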

The article appeared in the special issue "Blockchain and Big Data Analytics: AI-Driven Data Science" of the journal "Algorithms" (https://doi.org/10.3390/a19030176).


Source: Yeong Su Lee and Hendrik Bothe / RI CODE

 
