Example: GDPR Compliance Evidence

Demonstrate compliance with GDPR Article 32(1)(d), which calls for "a process for regularly testing, assessing and evaluating the effectiveness of technical and organisational measures", by generating signed evidence that your Kafka backups containing personal data can be restored.

Scenario

You process personal data (user events, consent records, data subject access requests) through Kafka. Your data protection officer (DPO) needs:

  • Monthly proof that backups of personal data can be restored
  • Evidence of how long a restore takes, as a recovery time objective (RTO) demonstration
  • Shorter evidence retention than SOX requires (1 year, not 7)
  • Point-in-time recovery (PITR) validation to prove data can be recovered to a chosen point in time

Step 1: Backup Personal Data Topics

gdpr-backup.yaml
mode: backup
backup_id: "personal-data-monthly"

source:
  bootstrap_servers:
    - kafka-eu:9092
  security:
    security_protocol: SASL_SSL
    sasl_mechanism: SCRAM-SHA-512
    sasl_username: backup-service
    sasl_password: "${KAFKA_PASSWORD}"
  topics:
    include:
      - user-events
      - consent-records
      - dsar-requests
      - "pii-*"

storage:
  backend: gcs
  bucket: gdpr-kafka-backups
  prefix: eu-west-1/monthly

backup:
  compression: zstd
  include_offset_headers: true
  stop_at_current_offsets: true

$ kafka-backup backup --config gdpr-backup.yaml

Step 2: Restore to a Point in Time

The key GDPR differentiator: prove you can restore to a specific point in time.

gdpr-restore.yaml
mode: restore
backup_id: "personal-data-monthly"

target:
  bootstrap_servers:
    - validation-kafka-eu:9092

storage:
  backend: gcs
  bucket: gdpr-kafka-backups
  prefix: eu-west-1/monthly

restore:
  # Restore data from a specific 24-hour window (epoch milliseconds, UTC)
  time_window_start: 1711843200000 # 2024-03-31T00:00:00Z
  time_window_end: 1711929600000 # 2024-04-01T00:00:00Z
  create_topics: true

$ kafka-backup restore --config gdpr-restore.yaml
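
The restore window is expressed in Unix epoch milliseconds (UTC), which is easy to get wrong by a year or by a factor of 1000 when typed by hand. The following is a minimal Python sketch for deriving the values. The helper name to_epoch_ms is illustrative and not part of kafka-backup. The two values used in the config above, 1711843200000 and 1711929600000, correspond to 2024-03-31 and 2024-04-01 at 00:00 UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(iso_utc: str) -> int:
    """Convert an ISO 8601 timestamp such as 2024-03-31T00:00:00Z to epoch milliseconds."""
    # Older Python versions reject a trailing "Z" in fromisoformat(), so normalise it.
    parsed = datetime.fromisoformat(iso_utc.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        raise ValueError("timestamp must carry an explicit UTC offset")
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)


window_start = to_epoch_ms("2024-03-31T00:00:00Z")
window_end = to_epoch_ms("2024-04-01T00:00:00Z")
print(window_start, window_end)  # prints: 1711843200000 1711929600000
```

Rejecting timestamps without an offset is a deliberate choice here: a naive local time would silently shift the restore window by the host's UTC offset.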

Step 3: Validate with PITR Timestamp

gdpr-validation.yaml
backup_id: "personal-data-monthly"

storage:
  backend: gcs
  bucket: gdpr-kafka-backups
  prefix: eu-west-1/monthly

target:
  bootstrap_servers:
    - validation-kafka-eu:9092

pitr_timestamp: 1711929600000 # The PITR point we restored to

checks:
  message_count:
    enabled: true
    mode: exact
    topics:
      - user-events
      - consent-records
      - dsar-requests
  offset_range:
    enabled: true

evidence:
  formats: [json, pdf]
  signing:
    enabled: true
    private_key_path: "/etc/kafka-backup/gdpr-signing-key.pem"
  storage:
    prefix: "evidence-reports/gdpr/"
    retention_days: 365 # 1 year (GDPR typical)

notifications:
  slack:
    webhook_url: "${SLACK_DPO_CHANNEL}"

triggered_by: "monthly-gdpr-validation"

$ kafka-backup validation run --config gdpr-validation.yaml

Expected output:

=== Validation Results ===
Overall: PASSED
Checks: 2/2 passed, 0 failed, 0 skipped
Duration: 23ms

[PASSED] MessageCountCheck — 3 topics; 142,891 messages expected, 142,891 restored; 0 discrepancies
[PASSED] OffsetRangeCheck — 9 partitions checked; 9 passed; 0 issues

What the Evidence Report Contains

The compliance_mappings.gdpr_art32 section demonstrates Article 32 compliance:

{
  "gdpr_art32": {
    "control": "Article 32 - Testing technical measures",
    "satisfied_by": ["MessageCountCheck", "OffsetRangeCheck"],
    "test_frequency": "on-demand",
    "rto_demonstrated_seconds": 47
  }
}
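
If you want to gate a pipeline or an audit checklist on this mapping, a small script can inspect the JSON report. The sketch below is hedged: it assumes the report nests gdpr_art32 under a compliance_mappings key (as the dotted path above suggests), and the function name art32_gaps and the 300-second RTO target are hypothetical choices, not part of kafka-backup.

```python
import json

REQUIRED_CHECKS = {"MessageCountCheck", "OffsetRangeCheck"}
MAX_RTO_SECONDS = 300  # hypothetical target; agree the real value with your DPO


def art32_gaps(report: dict) -> list:
    """Return a list of problems in the gdpr_art32 mapping; an empty list means none found."""
    mapping = report.get("compliance_mappings", {}).get("gdpr_art32")
    if mapping is None:
        return ["gdpr_art32 mapping is missing"]
    problems = []
    missing = REQUIRED_CHECKS - set(mapping.get("satisfied_by", []))
    if missing:
        problems.append(f"checks not listed in satisfied_by: {sorted(missing)}")
    rto = mapping.get("rto_demonstrated_seconds")
    if rto is None or rto > MAX_RTO_SECONDS:
        problems.append(f"RTO {rto}s is missing or exceeds {MAX_RTO_SECONDS}s")
    return problems


# Sample shaped like the excerpt above (nesting under compliance_mappings is assumed).
sample = json.loads("""
{
  "compliance_mappings": {
    "gdpr_art32": {
      "control": "Article 32 - Testing technical measures",
      "satisfied_by": ["MessageCountCheck", "OffsetRangeCheck"],
      "test_frequency": "on-demand",
      "rto_demonstrated_seconds": 47
    }
  }
}
""")
print(art32_gaps(sample))  # prints: []
```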

The pitr_timestamp in the backup section proves point-in-time recovery capability:

{
  "backup": {
    "pitr_timestamp": 1711929600000,
    "total_records": 142891
  }
}
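
An auditor will usually want to see that the recorded point matches the one that was requested, in a human-readable form. A rough Python sketch under the assumption that the report is parsed JSON shaped like the excerpt above; pitr_matches and ms_to_iso are illustrative helper names.

```python
from datetime import datetime, timedelta, timezone


def pitr_matches(report: dict, requested_ms: int) -> bool:
    """True when the report records exactly the requested PITR timestamp."""
    return report.get("backup", {}).get("pitr_timestamp") == requested_ms


def ms_to_iso(epoch_ms: int) -> str:
    """Render epoch milliseconds as an ISO 8601 UTC string (second precision)."""
    moment = datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(milliseconds=epoch_ms)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


report = {"backup": {"pitr_timestamp": 1711929600000, "total_records": 142891}}
print(pitr_matches(report, 1711929600000))           # prints: True
print(ms_to_iso(report["backup"]["pitr_timestamp"]))  # prints: 2024-04-01T00:00:00Z
```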

Schedule Monthly Validation

/etc/cron.d/gdpr-validation
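# Run on the 1st of every month at 03:00 (cron uses the host time zone, so keep the host on UTC for 03:00 UTC)
# Files in /etc/cron.d need a user field before the command; "root" is a placeholder for your service account
0 3 1 * * root kafka-backup validation run --config /etc/kafka-backup/gdpr-validation.yaml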

DSAR Response Workflow

When you receive a Data Subject Access Request (DSAR) and need to prove what data existed at a specific time:

# 1. Auditor specifies the exact point in time
$ kafka-backup validation run \
    --config gdpr-validation.yaml \
    --pitr 1709251200000 \
    --triggered-by "DSAR-2026-0142 - data subject request for user ID 12345"

# 2. The evidence report includes the triggered_by field
# proving chain of custody for the specific DSAR

Data Minimisation

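GDPR's data minimisation and storage limitation principles (Article 5(1)(c) and (e)) apply to evidence as well as to the data itself. Set evidence retention to match your data retention policy:

evidence:
  storage:
    retention_days: 365 # 1 year, matching your GDPR data retention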
Tip: For GDPR workloads, avoid storing personal data in the evidence report itself. The report contains only metadata (topic names, record counts, timestamps), not the actual message content.
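
One way to spot-check that claim before archiving a report is to scan its string values for obvious identifiers. The sketch below is a rough heuristic, not a kafka-backup feature: it assumes the JSON report has already been parsed, looks only for email-like tokens, and uses an illustrative function name. Free-text fields such as triggered_by are the most likely place for an identifier to slip in.

```python
import re

# Deliberately simple pattern; it catches typical addresses, not every valid one.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_email_like_values(node, path="$"):
    """Walk parsed JSON and yield (path, value) for strings containing an email-like token."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_email_like_values(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_email_like_values(value, f"{path}[{index}]")
    elif isinstance(node, str) and EMAIL_RE.search(node):
        yield path, node


clean = {"backup": {"total_records": 142891}, "triggered_by": "monthly-gdpr-validation"}
leaky = {"triggered_by": "DSAR for jane.doe@example.com"}

print(list(find_email_like_values(clean)))  # prints: []
print(list(find_email_like_values(leaky)))  # prints: [('$.triggered_by', 'DSAR for jane.doe@example.com')]
```

An empty result does not prove the report is free of personal data; it only rules out one common pattern.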

Next Steps