Backup Retention

Operator-managed retention is opt-in. The operator does not delete backup data unless spec.retention.enabled: true is set on a KafkaBackup.

Retention runs after a backup completes successfully. It deletes complete backup sets only, not individual segment files inside a retained backup. This keeps restore metadata consistent and avoids partial point-in-time restore windows.

Supported Backends

| Storage backend | Operator-managed retention |
| --- | --- |
| PVC/local | Supported |
| S3 and S3-compatible storage | Supported |
| Azure Blob Storage | Supported |
| GCS | Not currently supported; use a GCS lifecycle policy |

Use storage lifecycle policies instead of operator-managed retention when you need backend-native object lock, legal hold, cross-account enforcement, or GCS support.

Retention Fields

spec:
  retention:
    enabled: true
    maxAgeDays: 30
    keepLast: 3
    dryRun: true

| Field | Default | Description |
| --- | --- | --- |
| enabled | false | Enables operator-managed pruning |
| maxAgeDays | unset | Deletes complete backup sets older than this many days |
| keepLast | unset | Keeps at least this many newest backup sets |
| dryRun | false | Reports eligible backup sets without deleting data |

When retention is enabled, set at least one of maxAgeDays or keepLast. Values must be greater than 0.
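
The rules above can be checked before a manifest is applied. The following is a minimal sketch, not the operator's own validation: `validate_retention` and its snake_case parameters are hypothetical names, and it assumes the greater-than-zero rule applies whether or not retention is enabled.

```python
from typing import List, Optional


def validate_retention(
    enabled: bool = False,
    max_age_days: Optional[int] = None,
    keep_last: Optional[int] = None,
) -> List[str]:
    """Return a list of problems; an empty list means the stated rules hold."""
    problems = []
    # Any value that is set must be greater than 0.
    for name, value in (("maxAgeDays", max_age_days), ("keepLast", keep_last)):
        if value is not None and value <= 0:
            problems.append(f"{name} must be greater than 0")
    # Enabled retention needs at least one of the two limits.
    if enabled and max_age_days is None and keep_last is None:
        problems.append("set at least one of maxAgeDays or keepLast")
    return problems
```

An empty result for `validate_retention(enabled=True, max_age_days=30, keep_last=3)` indicates the example policy on this page satisfies both rules.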

The current backup ID is always retained. If only maxAgeDays is set, the operator still keeps at least the newest backup set as a safety guard.
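
To reason about which backup sets a policy would select, the sketch below models the selection in plain Python. It is an illustration under stated assumptions, not the operator's implementation: `select_eligible` and the `(backup ID, completion time)` tuples are hypothetical, and it assumes that when both fields are set a backup set must be older than `maxAgeDays` and also fall outside the newest `keepLast` sets to become eligible.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple

# Hypothetical shape: (backup ID, completion time).
Backup = Tuple[str, datetime]


def select_eligible(
    backups: List[Backup],
    now: datetime,
    max_age_days: Optional[int] = None,
    keep_last: Optional[int] = None,
    current_id: Optional[str] = None,
) -> List[str]:
    """Return IDs of backup sets that the modelled policy would prune."""
    if max_age_days is None and keep_last is None:
        return []  # no limit configured, nothing is eligible
    newest_first = sorted(backups, key=lambda b: b[1], reverse=True)
    # The newest set is always protected, even when only maxAgeDays is set.
    protected = max(keep_last or 0, 1)
    eligible = []
    for rank, (backup_id, completed_at) in enumerate(newest_first):
        if rank < protected or backup_id == current_id:
            continue  # inside the keepLast window, or the current backup
        if max_age_days is not None and now - completed_at <= timedelta(days=max_age_days):
            continue  # not older than maxAgeDays
        eligible.append(backup_id)
    return eligible


# Example: five backup sets aged 1, 10, 20, 40, and 50 days.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
backups = [(f"b{age}", now - timedelta(days=age)) for age in (1, 10, 20, 40, 50)]
print(select_eligible(backups, now, max_age_days=30, keep_last=3))  # ['b40', 'b50']
```

Under these assumptions, raising `keep_last` to 5 in the example selects nothing, because all five sets then fall inside the protected window.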

Start with dryRun: true:

apiVersion: kafka.oso.sh/v1alpha1
kind: KafkaBackup
metadata:
  name: production-nightly
  namespace: kafka-backup
spec:
  schedule: "0 0 2 * * * *"
  stopAtCurrentOffsets: true

  kafkaCluster:
    bootstrapServers:
      - kafka:9092

  topics:
    - orders
    - payments

  storage:
    storageType: s3
    s3:
      bucket: kafka-backups
      region: us-west-2
      prefix: production/nightly
      credentialsSecret:
        name: s3-credentials

  retention:
    enabled: true
    maxAgeDays: 30
    keepLast: 3
    dryRun: true

After a backup run completes, inspect the retention status:

kubectl get kafkabackup production-nightly -n kafka-backup \
  -o jsonpath='{.status.retentionInspectedBackups}{" inspected, "}{.status.retentionEligibleBackups}{" eligible, "}{.status.retentionDeletedBackups}{" deleted, dryRun="}{.status.retentionDryRun}{"\n"}'

Check for retention errors:

kubectl get kafkabackup production-nightly -n kafka-backup \
  -o jsonpath='{.status.retentionError}{"\n"}'

Switch to active deletion only after the dry-run output matches the intended policy:

spec:
  retention:
    enabled: true
    maxAgeDays: 30
    keepLast: 3
    dryRun: false

Status Fields

| Field | Description |
| --- | --- |
| lastRetentionTime | Time of the last retention run |
| retentionInspectedBackups | Number of backup sets considered for this KafkaBackup |
| retentionEligibleBackups | Number of backup sets selected by policy |
| retentionDeletedBackups | Number of backup sets deleted |
| retentionReclaimedBytes | Bytes reclaimed, or bytes that would be reclaimed during dry run |
| retentionDryRun | Whether the last retention run was a dry run |
| retentionError | Last retention error, if pruning failed after a successful backup |

Retention errors do not make a successful backup fail. Alert on retentionError if retention is part of your cost or compliance controls.
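
Because a failed prune leaves the backup itself marked successful, an external check has to read `retentionError` directly. The sketch below is one possible shape for such a check, assuming `kubectl` is on the PATH and has access to the cluster. The function names and the alert message format are hypothetical; only the `status.retentionError` field comes from this page.

```python
import json
import subprocess
from typing import Optional


def retention_alert(resource: dict) -> Optional[str]:
    """Return an alert message if the KafkaBackup reports a retention error."""
    status = resource.get("status") or {}
    error = status.get("retentionError")
    if error:
        name = resource.get("metadata", {}).get("name", "<unknown>")
        return f"retention failed for {name}: {error}"
    return None  # field absent or empty: nothing to report


def fetch_kafkabackup(name: str, namespace: str) -> dict:
    """Read the KafkaBackup resource as JSON via kubectl."""
    result = subprocess.run(
        ["kubectl", "get", "kafkabackup", name, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

A periodic job could call `retention_alert(fetch_kafkabackup("production-nightly", "kafka-backup"))` and forward any non-empty message to an alerting channel.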

Continuous Backups

For long-running continuous: true backups, retention does not prune while the backup engine is still running. Use scheduled snapshot backups or scheduled incremental backups when you need the operator to enforce retention on a recurring basis.

Restore Impact

Deleting a backup set removes that backup ID from the restorable history. Keep maxAgeDays and keepLast aligned with recovery objectives, audit requirements, and any legal hold process.

Do not enable operator-managed retention on backup prefixes that are also managed by an incompatible external cleanup process. If both the operator and the storage backend enforce retention, make sure the policies have the same restore window.