# Backup Validation & Compliance Evidence
Automatically validate that your Kafka backups can be restored correctly and generate cryptographically signed evidence reports for auditors.
## Overview
The validation suite runs checks against a restored Kafka cluster and compares the results against the original backup manifest. It produces:
- JSON evidence reports — machine-readable, deterministic, suitable for automation
- PDF evidence reports — auditor-ready, branded, suitable for direct submission
- Detached signatures — ECDSA-P256-SHA256 cryptographic proof of report integrity
- Compliance mappings — automatic mapping to SOX ITGC, CMMC RE.3.139, and GDPR Article 32
## Prerequisites
- OSO Kafka Backup installed (v0.11.0+)
- An existing backup in object storage or filesystem
- A Kafka cluster with the backup data restored (the "target" cluster)
- Optional: OpenSSL for generating signing keys
The validation tool does not perform the restore itself. Run `kafka-backup restore` first, then validate the result. This separation ensures the validation is an independent check.
## Step 1: Create a Validation Config

Create `validation.yaml`:
```yaml
# Backup to validate against
backup_id: "production-daily-001"

# Where the backup is stored
storage:
  backend: s3
  bucket: my-kafka-backups
  region: us-west-2
  prefix: production/daily

# The restored Kafka cluster to validate
target:
  bootstrap_servers:
    - restored-kafka:9092

# Which checks to run
checks:
  message_count:
    enabled: true
    mode: exact  # exact | sample
  offset_range:
    enabled: true
  consumer_group_offsets:
    enabled: false  # Enable if consumer groups were restored

# Evidence report settings
evidence:
  formats:
    - json
    - pdf
  storage:
    prefix: "evidence-reports/"
    retention_days: 2555  # ~7 years (SOX requirement)
```
## Step 2: Run Validation
```shell
$ kafka-backup validation run --config validation.yaml
```
You'll see output like:
```
=== Validation Results ===

Overall: PASSED
Checks: 2/2 passed, 0 failed, 0 skipped
Duration: 18ms

[PASSED] MessageCountCheck — 3 topics; 1000 messages expected, 1000 restored; 0 discrepancies
[PASSED] OffsetRangeCheck — 9 partitions checked; 9 passed; 0 issues

JSON evidence uploaded: evidence-reports/validation-9275b4aa/2026/04/validation-9275b4aa.json
PDF evidence uploaded: evidence-reports/validation-9275b4aa/2026/04/validation-9275b4aa.pdf
```
The command exits with code `0` on success and code `1` if any check fails.
### Ad-hoc Auditor-Triggered Runs
When an auditor requests a specific point-in-time validation:
```shell
$ kafka-backup validation run \
    --config validation.yaml \
    --pitr 1711929600000 \
    --triggered-by "KPMG Q1 2026 audit"
```
The `--triggered-by` string is recorded in the evidence report, providing a clear chain of custody.
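The `--pitr` value is an epoch timestamp; the 13 digits in the example indicate milliseconds. A quick Python check of the instant being requested:

```python
from datetime import datetime, timezone

# --pitr takes an epoch timestamp; the 13-digit example value is milliseconds
pitr_ms = 1711929600000
instant = datetime.fromtimestamp(pitr_ms / 1000, tz=timezone.utc)
print(instant.isoformat())  # 2024-04-01T00:00:00+00:00
```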
## Step 3: Review the Evidence Report
The JSON evidence report contains:
- Backup metadata — ID, source cluster, topics, partitions, record counts
- Validation results — per-check pass/fail with machine-readable data
- Integrity information — SHA-256 checksums, signature algorithm
- Compliance mappings — which checks satisfy which regulatory controls
```shell
# List available evidence reports
$ kafka-backup validation evidence-list --path s3://my-kafka-backups

# Download a specific report
$ kafka-backup validation evidence-get \
    --path s3://my-kafka-backups \
    --report-id validation-9275b4aa \
    --format json \
    --output evidence-report.json
```
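Once downloaded, the report is straightforward to consume in automation. A minimal sketch using a hypothetical report body; the field names (`report_id`, `overall_result`, `checks`) are illustrative assumptions, so consult the Evidence Report Schema for the real ones:

```python
import json

# Hypothetical evidence-report shape; the real schema is documented separately
report = json.loads("""
{
  "report_id": "validation-9275b4aa",
  "overall_result": "PASSED",
  "checks": [
    {"name": "MessageCountCheck", "result": "PASSED"},
    {"name": "OffsetRangeCheck", "result": "PASSED"}
  ]
}
""")

failed = [c["name"] for c in report["checks"] if c["result"] != "PASSED"]
print(f"{report['report_id']}: {report['overall_result']}; failed: {failed or 'none'}")
```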
## Step 4: Generate a PDF Report

Include `pdf` in the `formats` list:
```yaml
evidence:
  formats:
    - json
    - pdf
```
The PDF contains:
- Page 1 — Cover page with overall result (PASSED/FAILED), report ID, timestamp
- Page 2 — Validation check results table
- Page 3 — Integrity details and compliance framework mappings (SOX, CMMC, GDPR)
## Step 5: Sign the Evidence Report
See the Evidence Signing Guide for detailed key management instructions.
Quick setup:
```shell
# Generate an ECDSA-P256 key pair
$ openssl ecparam -genkey -name prime256v1 -noout | \
    openssl pkcs8 -topk8 -nocrypt -out signing-key.pem
$ openssl ec -in signing-key.pem -pubout -out signing-key-pub.pem
```
Add to your config:
```yaml
evidence:
  signing:
    enabled: true
    private_key_path: "/etc/kafka-backup/signing-key.pem"
```
With signing enabled, each report is written with a detached `.sig` file alongside the JSON and PDF.
## Step 6: Verify the Signature
```shell
$ kafka-backup validation evidence-verify \
    --report evidence-report.json \
    --signature evidence-report.sig \
    --public-key signing-key-pub.pem

Report ID: validation-9275b4aa-2aeb-4910-a3a6-9e4aa1dc016a
Algorithm: ECDSA-P256-SHA256
Report SHA-256: 2482bbdfa113146e39a4884767002554...
SHA-256 checksum: VALID
ECDSA signature: VALID

Evidence report integrity: VERIFIED
```
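The checksum half of that verification is ordinary SHA-256 over the report bytes. A sketch of the tamper-detection idea with Python's standard library (whether the tool canonicalizes the JSON before hashing is not shown above, and is left aside here):

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

original = b'{"overall_result": "PASSED"}'
recorded = sha256_hex(original)          # digest captured at signing time
tampered = b'{"overall_result": "FAILED"}'

print(sha256_hex(original) == recorded)  # True: untouched report verifies
print(sha256_hex(tampered) == recorded)  # False: any edit changes the digest
```

The ECDSA signature adds what a bare checksum cannot: proof that the digest was produced by the holder of the private key, not just that the bytes are internally consistent.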
## Step 7: Set Up Notifications
Get alerted when validation passes or fails:
```yaml
notifications:
  slack:
    webhook_url: "https://hooks.slack.com/services/T00/B00/xxxxx"
  pagerduty:
    integration_key: "your-pagerduty-integration-key"
    severity: critical  # Triggers on failure only
```
Slack receives a Block Kit message with the result, check summary, and a link to the evidence report. PagerDuty receives an Events API v2 trigger on failure and auto-resolves on the next success.
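The Events API v2 envelope itself is standard; as a sketch, a failure trigger might be assembled like the following, where the summary text and `dedup_key` scheme are assumptions rather than the tool's documented behavior:

```python
import json

def pagerduty_trigger(integration_key: str, report_id: str, failed_checks: list) -> dict:
    # PagerDuty Events API v2 envelope; a stable dedup_key is what lets
    # the next successful run auto-resolve the same incident
    return {
        "routing_key": integration_key,
        "event_action": "trigger",
        "dedup_key": f"kafka-backup-validation-{report_id}",
        "payload": {
            "summary": f"Backup validation {report_id} FAILED: {', '.join(failed_checks)}",
            "severity": "critical",
            "source": "kafka-backup",
        },
    }

event = pagerduty_trigger("abc123def456", "validation-9275b4aa", ["MessageCountCheck"])
print(json.dumps(event, indent=2))
```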
## Validation Checks

### MessageCountCheck
Compares per-partition record counts between the backup manifest and the restored cluster. Fails if the number of discrepancies exceeds `fail_threshold`.
```yaml
checks:
  message_count:
    enabled: true
    mode: exact            # exact: all partitions | sample: random subset
    sample_percentage: 100
    topics: []             # Empty = all topics in the backup
    fail_threshold: 0      # 0 = fail on any discrepancy
```
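Conceptually the check reduces to a per-partition count comparison against the manifest; an illustrative sketch, not the tool's implementation:

```python
def message_count_check(expected: dict, actual: dict, fail_threshold: int = 0) -> bool:
    """expected/actual map (topic, partition) -> record count."""
    discrepancies = sum(
        1 for tp, count in expected.items() if actual.get(tp) != count
    )
    return discrepancies <= fail_threshold

manifest = {("orders", 0): 500, ("orders", 1): 500}
restored = {("orders", 0): 500, ("orders", 1): 499}  # one record short

print(message_count_check(manifest, restored))                    # False: threshold 0
print(message_count_check(manifest, restored, fail_threshold=1))  # True: one tolerated
```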
### OffsetRangeCheck
Verifies that each partition's high and low watermarks in the restored cluster match the backup manifest's segment offset ranges.
```yaml
checks:
  offset_range:
    enabled: true
    verify_high_watermark: true
    verify_low_watermark: true
```
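The per-partition comparison can be pictured as follows; treating the manifest's end offset as the expected high watermark is an assumption of this sketch:

```python
def offset_range_check(manifest_range: tuple, low_watermark: int, high_watermark: int) -> bool:
    # Manifest records the offset range covered by the backup segments;
    # the restored partition's watermarks must match it exactly
    start, end = manifest_range
    return low_watermark == start and high_watermark == end

print(offset_range_check((0, 1000), low_watermark=0, high_watermark=1000))  # True
print(offset_range_check((0, 1000), low_watermark=0, high_watermark=900))   # False: short restore
```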
### ConsumerGroupOffsetCheck
Verifies that consumer group offsets are present and valid in the restored cluster.
```yaml
checks:
  consumer_group_offsets:
    enabled: true
    verify_all_groups: true  # false = only check groups listed below
    groups: []               # Empty + verify_all_groups = all groups
```
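A sketch of what "present and valid" means per group on a single partition (illustrative only; the tool's exact validity rules may differ):

```python
def consumer_group_check(expected: dict, restored: dict, high_watermark: int) -> list:
    """Return the groups whose offsets are missing, changed, or out of range."""
    bad = []
    for group, offset in expected.items():
        got = restored.get(group)
        if got is None or got != offset or not (0 <= got <= high_watermark):
            bad.append(group)
    return bad

expected = {"orders-consumer": 950, "audit-consumer": 1000}
restored = {"orders-consumer": 950}  # audit-consumer's offset was not restored

print(consumer_group_check(expected, restored, high_watermark=1000))  # ['audit-consumer']
```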
### CustomWebhookCheck
Call your own validation endpoint. The tool POSTs a JSON payload with the backup ID and restored cluster details, and expects a pass/fail response.
```yaml
checks:
  custom_webhooks:
    - name: application-health-check
      url: "https://internal.example.com/kafka-validation-hook"
      timeout_seconds: 120
      expected_status_code: 200
      fail_on_timeout: true
```
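On the receiving side, the endpoint only has to answer with the expected status code before the timeout. A minimal sketch of a handler's decision logic, where the `backup_id` payload field and the response body shape are assumptions about the request schema:

```python
def handle_validation_hook(payload: dict) -> tuple:
    """Return (http_status, response_body) for a validation webhook call."""
    backup_id = payload.get("backup_id")
    # Hypothetical application-level check: confirm our own bookkeeping
    # agrees that this backup's data is usable
    known_good_backups = {"production-daily-001"}
    if backup_id in known_good_backups:
        return 200, {"result": "pass", "backup_id": backup_id}
    return 500, {"result": "fail", "backup_id": backup_id}

status, body = handle_validation_hook({"backup_id": "production-daily-001"})
print(status, body["result"])  # 200 pass
```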
## Complete Configuration Example
```yaml
backup_id: "production-daily-001"

storage:
  backend: s3
  bucket: my-kafka-backups
  region: us-west-2
  prefix: production/daily

target:
  bootstrap_servers:
    - restored-kafka-0:9092
    - restored-kafka-1:9092
  security:
    security_protocol: SASL_SSL
    sasl_mechanism: SCRAM-SHA-512
    sasl_username: backup-validator
    sasl_password: "${KAFKA_PASSWORD}"

checks:
  message_count:
    enabled: true
    mode: exact
    fail_threshold: 0
  offset_range:
    enabled: true
  consumer_group_offsets:
    enabled: true
    verify_all_groups: true
  custom_webhooks:
    - name: order-service-check
      url: "https://internal.example.com/validation/orders"
      timeout_seconds: 120

evidence:
  formats: [json, pdf]
  signing:
    enabled: true
    private_key_path: "/etc/kafka-backup/signing-key.pem"
  storage:
    prefix: "evidence-reports/"
    retention_days: 2555

notifications:
  slack:
    webhook_url: "https://hooks.slack.com/services/T00/B00/xxxxx"
  pagerduty:
    integration_key: "abc123def456"
    severity: critical

triggered_by: "weekly-cron-job"
```
## Next Steps
- Evidence Signing Guide — deep-dive on key management and PKI integration
- Validation Config Reference — complete option reference
- Evidence Report Schema — JSON schema documentation
- SOX Compliance Example — end-to-end SOX scenario
- GDPR Compliance Example — GDPR Article 32 scenario
- Compliance Evidence Use Cases — why and when to use this