Scheduled Backups

Configure recurring backup schedules with the KafkaBackup CRD.

Cron Schedule Format

The operator uses the Rust cron parser. Schedules start with a seconds field and may end with an optional year field:

┌───────────── second (0 - 59)
│ ┌───────────── minute (0 - 59)
│ │ ┌───────────── hour (0 - 23)
│ │ │ ┌───────────── day of month (1 - 31)
│ │ │ │ ┌───────────── month (1 - 12)
│ │ │ │ │ ┌───────────── day of week (0 - 6) (Sunday to Saturday)
│ │ │ │ │ │ ┌───────────── year (optional)
│ │ │ │ │ │ │
* * * * * * *

Common Schedules

Schedule	Cron Expression	Description
Every hour	`0 0 * * * * *`	At minute 0 of every hour
Every 15 minutes	`0 */15 * * * * *`	Every 15 minutes
Every 6 hours	`0 0 */6 * * * *`	At minute 0 past every 6th hour
Daily at midnight	`0 0 0 * * * *`	Every day at 00:00
Daily at 2 AM	`0 0 2 * * * *`	Every day at 02:00
Weekly on Sunday	`0 0 0 * * 0 *`	Every Sunday at 00:00
Monthly	`0 0 0 1 * * *`	First day of month at 00:00
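
As a minimal sketch of how a row from the table is used, the fragment below sets the "Daily at 2 AM" expression on a KafkaBackup. It is a partial spec rather than a complete resource: kafkaCluster, topics, and storage are omitted here and would be filled in as in the full examples in this guide. The expression is quoted, as in those examples, so YAML always treats it as a string.

```yaml
spec:
  # Fields: second minute hour day-of-month month day-of-week year
  schedule: "0 0 2 * * * *"   # every day at 02:00
  stopAtCurrentOffsets: true  # point-in-time snapshot, exits when done
```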

Scheduled Snapshot Backup

Set stopAtCurrentOffsets: true for scheduled point-in-time backups. The backup captures the current high watermarks when it starts and exits after all partitions reach them.

apiVersion: kafka.oso.sh/v1alpha1
kind: KafkaBackup
metadata:
  name: hourly-backup
  namespace: kafka-backup
spec:
  schedule: "0 0 * * * * *"
  stopAtCurrentOffsets: true

  kafkaCluster:
    bootstrapServers:
      - kafka:9092

  topics:
    - orders
    - payments

  storage:
    storageType: s3
    s3:
      bucket: kafka-backups
      region: us-west-2
      prefix: hourly
      credentialsSecret:
        name: s3-credentials

  compression: zstd
  compressionLevel: 3
  includeOffsetHeaders: true
  sourceClusterId: production

Scheduled Backups with Split-DNS Storage

For S3-compatible storage reached through VPN or split DNS, add template.pod.hostAliases. The operator copies these aliases into the CronJob pod template, so every scheduled backup pod receives the same /etc/hosts entries.

apiVersion: kafka.oso.sh/v1alpha1
kind: KafkaBackup
metadata:
  name: private-s3-hourly
  namespace: kafka-backup
spec:
  schedule: "0 0 * * * * *"
  stopAtCurrentOffsets: true

  kafkaCluster:
    bootstrapServers:
      - kafka:9092

  topics:
    - orders
    - payments

  storage:
    storageType: s3
    s3:
      bucket: kafka-backups
      region: us-east-1
      endpoint: https://s3.internal
      pathStyle: true
      credentialsSecret:
        name: s3-credentials

  template:
    pod:
      hostAliases:
        - ip: "10.10.0.5"
          hostnames:
            - s3.internal
            - minio.internal

Continuous Backup

Set continuous: true for streaming backups. In v1.0.0 and later, checkpoint.enabled does not imply continuous mode.

apiVersion: kafka.oso.sh/v1alpha1
kind: KafkaBackup
metadata:
  name: continuous-backup
  namespace: kafka-backup
spec:
  continuous: true

  kafkaCluster:
    bootstrapServers:
      - kafka:9092

  topics:
    - "*"

  storage:
    storageType: azure
    azure:
      accountName: kafkabackups123456
      container: kafka-backups
      prefix: continuous
      useWorkloadIdentity: true

  checkpoint:
    enabled: true
    intervalSecs: 30
  consumerGroupSnapshot: true

Retention

Retention is disabled by default. If spec.retention is omitted, scheduled backups continue to create new backup IDs and the operator does not delete old backup data.

Enable retention per KafkaBackup when you want the operator to prune complete backup sets after a successful run:

apiVersion: kafka.oso.sh/v1alpha1
kind: KafkaBackup
metadata:
  name: hourly-backup
  namespace: kafka-backup
spec:
  schedule: "0 0 * * * * *"
  stopAtCurrentOffsets: true

  kafkaCluster:
    bootstrapServers:
      - kafka:9092

  topics:
    - orders
    - payments

  storage:
    storageType: s3
    s3:
      bucket: kafka-backups
      region: us-west-2
      prefix: hourly
      credentialsSecret:
        name: s3-credentials

  retention:
    enabled: true
    maxAgeDays: 30
    keepLast: 3
    dryRun: true

Start with dryRun: true and inspect .status.retentionEligibleBackups, .status.retentionDeletedBackups, .status.retentionReclaimedBytes, and .status.retentionError. Change dryRun to false only after the reported policy matches your restore and compliance requirements.
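
Once the dry-run status fields report the expected result, the same policy with deletion enabled might look like the fragment below. Only dryRun changes. The other values are carried over from the example above and are illustrative, not recommendations.

```yaml
spec:
  retention:
    enabled: true
    maxAgeDays: 30
    keepLast: 3
    dryRun: false  # was true while the policy was being reviewed
```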

Retention deletes whole backup IDs, not individual segments. The current backup ID is always retained, and if only maxAgeDays is set, the operator still keeps at least the newest backup set. Use keepLast as an additional safety guard for scheduled backups.

Operator-managed retention works with PVC/local storage, S3/S3-compatible storage, and Azure Blob Storage. For GCS, or when you need backend-native legal hold or object-lock controls, use storage-level lifecycle policies scoped to the backup prefix.

For long-running continuous: true backups, retention does not prune while the backup engine is still running. Treat retention as primarily useful for scheduled and on-demand backup sets.

Incremental Scheduled Backup

Available from v0.13.5

Set checkpoint.enabled: true with stopAtCurrentOffsets: true to create incremental scheduled backups. Each scheduled run picks up where the previous one stopped, backing up only new messages. This is the recommended pattern for most scheduled backup workloads because it combines the consistency guarantees of snapshot mode with the efficiency of incremental backups.

apiVersion: kafka.oso.sh/v1alpha1
kind: KafkaBackup
metadata:
  name: incremental-hourly
  namespace: kafka-backup
spec:
  schedule: "0 0 * * * * *"
  stopAtCurrentOffsets: true

  kafkaCluster:
    bootstrapServers:
      - kafka:9092

  topics:
    - orders
    - payments

  storage:
    storageType: s3
    s3:
      bucket: kafka-backups
      region: us-west-2
      prefix: incremental-hourly
      credentialsSecret:
        name: s3-credentials

  checkpoint:
    enabled: true
    intervalSecs: 30
  compression: zstd
  compressionLevel: 3
  includeOffsetHeaders: true

With this configuration:

The first run backs up all existing data, saves the offset checkpoint, and exits.
Subsequent runs load the checkpoint, skip already-backed-up data, and back up only new messages.
The manifest is merged across runs, so all segments from all runs are preserved.
If a run fails, the next run resumes from the last successful checkpoint.

When to use incremental vs full scheduled backups

Use incremental when your topics have high throughput and you want fast, efficient scheduled backups. Use full (without checkpoint.enabled) when you want each backup to be a standalone, self-contained snapshot, for example as compliance evidence where each backup must independently cover a specific time window.
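
In configuration terms, the two modes differ only in the checkpoint block. The partial specs below sketch each variant side by side, using values from the examples in this guide. The kafkaCluster, topics, and storage sections are omitted for brevity and are still required in a real resource.

```yaml
# Full: every run is a standalone snapshot (no checkpoint block)
spec:
  schedule: "0 0 * * * * *"
  stopAtCurrentOffsets: true
---
# Incremental: each run resumes from the previous run's checkpoint
spec:
  schedule: "0 0 * * * * *"
  stopAtCurrentOffsets: true
  checkpoint:
    enabled: true
    intervalSecs: 30
```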

Multi-Tier Backup Strategy

---
# Tier 1: frequent critical-topic snapshots
apiVersion: kafka.oso.sh/v1alpha1
kind: KafkaBackup
metadata:
  name: tier1-frequent
spec:
  schedule: "0 */15 * * * * *"
  stopAtCurrentOffsets: true
  kafkaCluster:
    bootstrapServers:
      - kafka:9092
  topics:
    - orders
    - payments
  storage:
    storageType: s3
    s3:
      bucket: kafka-backups
      prefix: tier1-frequent
      region: us-west-2
      credentialsSecret:
        name: s3-credentials
  compression: lz4

---
# Tier 2: hourly all-topic snapshots
apiVersion: kafka.oso.sh/v1alpha1
kind: KafkaBackup
metadata:
  name: tier2-hourly
spec:
  schedule: "0 0 * * * * *"
  stopAtCurrentOffsets: true
  kafkaCluster:
    bootstrapServers:
      - kafka:9092
  topics:
    - "*"
  storage:
    storageType: s3
    s3:
      bucket: kafka-backups
      prefix: tier2-hourly
      region: us-west-2
      credentialsSecret:
        name: s3-credentials
  compression: zstd

One-Time Backup

For manual or on-demand backups, omit schedule.

apiVersion: kafka.oso.sh/v1alpha1
kind: KafkaBackup
metadata:
  name: manual-backup-20260413
spec:
  kafkaCluster:
    bootstrapServers:
      - kafka:9092
  topics:
    - orders
    - payments
  storage:
    storageType: s3
    s3:
      bucket: kafka-backups
      region: us-west-2
      prefix: manual
      credentialsSecret:
        name: s3-credentials
  stopAtCurrentOffsets: true

Pausing a Schedule

Set suspend: true in the spec to pause a schedule:

spec:
  suspend: true

Monitoring Schedules

# Next scheduled run
kubectl get kafkabackup hourly-backup -n kafka-backup \
  -o jsonpath='{.status.nextScheduledBackup}'

# Time of the last backup
kubectl get kafkabackup hourly-backup -n kafka-backup \
  -o jsonpath='{.status.lastBackupTime}'

Best Practices

Set stopAtCurrentOffsets: true for recurring point-in-time backups.
Set continuous: true only for streaming backups.
Keep includeOffsetHeaders: true when you plan to restore and reset offsets.
Use opt-in operator retention or storage lifecycle policies for backup data retention.
Enable consumerGroupSnapshot: true when consumer group recovery matters.

Next Steps

Backup Retention - Configure opt-in retention
GitOps Integration - Version control your backup configs
Secrets Guide - Configure credentials
KafkaBackup CRD - Full specification
