Frequently Asked Questions
Common questions about OSO Kafka Backup, organized by category.
General
What is OSO Kafka Backup?
OSO Kafka Backup is an open-source, high-performance backup and restore tool for Apache Kafka, written in Rust. It provides point-in-time recovery (PITR) for Kafka topics and consumer group offsets, supports multi-cloud storage backends (S3, Azure Blob Storage, GCS), and ships as a single static binary. The project is licensed under the MIT License.
How does OSO Kafka Backup differ from MirrorMaker 2?
MirrorMaker 2 is a replication tool designed to mirror data between live Kafka clusters in real time. OSO Kafka Backup is a backup and recovery tool designed to create durable, versioned copies of your Kafka data in external object storage. Key differences:
| Capability | MirrorMaker 2 | OSO Kafka Backup |
|---|---|---|
| Primary purpose | Cross-cluster replication | Backup and restore |
| Storage target | Another Kafka cluster | Object storage (S3, GCS, Azure Blob) |
| Point-in-time recovery | No | Yes |
| Offset recovery | Limited | Full consumer group offset restore |
| Independent of Kafka | No (requires target cluster) | Yes (stores to object storage) |
Use MirrorMaker 2 for active-active or active-passive cluster topologies. Use OSO Kafka Backup for disaster recovery, compliance archival, and point-in-time restore scenarios.
How does OSO Kafka Backup compare to Confluent Replicator?
Unlike Confluent Replicator, OSO Kafka Backup:
- Stores backups in external object storage rather than requiring a destination Kafka cluster
- Supports point-in-time recovery (PITR) to restore data to any arbitrary timestamp
- Recovers consumer group offsets so applications resume from the correct position after restore
- Ships as a single binary with no dependencies on the Confluent Platform or Connect framework
- Is open source under the MIT License, with no per-broker licensing costs
Is OSO Kafka Backup production-ready?
Yes. OSO Kafka Backup is built in Rust for memory safety and high performance. In production environments it achieves throughput exceeding 100 MB/s and operates with less than 500 MB of memory. It includes built-in checkpointing for crash resilience, Prometheus metrics for observability, and has been validated across enterprise workloads.
What Kafka versions are supported?
OSO Kafka Backup supports any Kafka cluster that implements the Kafka protocol version 0.10 or later. This includes clusters running in both ZooKeeper mode and KRaft mode. The tool uses the standard Kafka consumer and producer APIs, so it is compatible with all Kafka distributions that adhere to the protocol.
What managed Kafka services are supported?
OSO Kafka Backup works with all major managed Kafka services, including:
- Amazon MSK (both provisioned and serverless)
- Confluent Cloud
- Aiven for Apache Kafka
- Redpanda (Kafka API-compatible)
- Azure Event Hubs for Kafka (Kafka protocol endpoint)
Any service that exposes a standard Kafka protocol endpoint is supported.
What is the difference between the OSS and Enterprise editions?
OSS edition includes:
- Full backup and restore functionality
- Point-in-time recovery (PITR)
- Compression (zstd, gzip, snappy, lz4)
- Prometheus metrics and monitoring
- Consumer group offset backup and restore
Enterprise edition adds:
- Client-side AES-256 encryption
- Role-based access control (RBAC)
- Audit logging
- GDPR compliance tools (data masking, right to be forgotten, field-level redaction)
- Schema Registry backup and restore
- Priority support with SLAs
Backup Operations
How do I schedule automated backups?
There are two primary approaches:
**Kubernetes CronJob** -- run `kafka-backup backup` with `stop_at_current_offsets: true` on a schedule:
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: kafka-backup-scheduled
spec:
  schedule: "0 */6 * * *"  # Every 6 hours
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: kafka-backup
              image: ghcr.io/osodevops/kafka-backup:latest
              args: ["backup", "--config", "/etc/kafka-backup/config.yaml"]
          restartPolicy: OnFailure
```
**Kafka Backup Operator** -- use the `KafkaBackupSchedule` CRD to define schedules declaratively:
```yaml
apiVersion: kafkabackup.oso.sh/v1alpha1
kind: KafkaBackupSchedule
metadata:
  name: daily-backup
spec:
  schedule: "0 2 * * *"
  backupSpec:
    configRef:
      name: backup-config
```
Can I back up specific topics?
Yes. Use `topics.include` and `topics.exclude` with wildcard patterns:
```yaml
topics:
  include:
    - "orders.*"
    - "payments.*"
    - "inventory.updates"
  exclude:
    - "*.test"
    - "*.staging"
```
Patterns use glob-style matching. If `include` is not specified, all topics are backed up. The `exclude` list takes precedence over `include`.
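The selection rules above can be sketched in Python with `fnmatch`-style globs. This is an illustration of the documented precedence (exclude wins, absent include matches everything), not the tool's actual implementation:

```python
from fnmatch import fnmatch

def topic_selected(topic, include=None, exclude=None):
    """Mimic the documented rules: exclude takes precedence over include;
    an absent include list matches every topic."""
    if exclude and any(fnmatch(topic, pat) for pat in exclude):
        return False
    if not include:
        return True
    return any(fnmatch(topic, pat) for pat in include)

include = ["orders.*", "payments.*", "inventory.updates"]
exclude = ["*.test", "*.staging"]

print(topic_selected("orders.created", include, exclude))  # True
print(topic_selected("orders.test", include, exclude))     # False -- exclude wins
print(topic_selected("users.signup", include, exclude))    # False -- not in include
```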
How does incremental backup work?
OSO Kafka Backup uses checkpoint-based incremental backups. A local SQLite database tracks the last committed offset for each topic-partition. On each backup run, the tool resumes consuming from the last checkpointed offset, so only new messages are read and stored. This makes subsequent backup runs significantly faster and reduces storage costs.
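The checkpoint-and-resume cycle can be sketched with Python's built-in `sqlite3` module. The table layout and column names here are assumptions for illustration, not the tool's actual schema:

```python
import sqlite3

def open_checkpoints(path=":memory:"):
    # One row per topic-partition, storing the last committed offset.
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS checkpoints (
        topic TEXT, part INTEGER, last_offset INTEGER,
        PRIMARY KEY (topic, part))""")
    return db

def resume_offset(db, topic, part):
    # Resume one past the last checkpointed offset; 0 for a fresh partition.
    row = db.execute(
        "SELECT last_offset FROM checkpoints WHERE topic=? AND part=?",
        (topic, part)).fetchone()
    return 0 if row is None else row[0] + 1

def commit_checkpoint(db, topic, part, offset):
    # Called only after a segment is durably written to object storage,
    # which is what makes interrupted runs safe to re-run.
    db.execute("INSERT OR REPLACE INTO checkpoints VALUES (?, ?, ?)",
               (topic, part, offset))
    db.commit()

db = open_checkpoints()
print(resume_offset(db, "orders", 0))   # 0 -- first run reads from the beginning
commit_checkpoint(db, "orders", 0, 41999)
print(resume_offset(db, "orders", 0))   # 42000 -- next run reads only new messages
```

Because the checkpoint is committed only after data reaches storage, a crash between the two steps at worst re-reads messages that were never persisted, which is why interrupted runs neither lose data nor duplicate it.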
What happens if a backup fails mid-run?
The checkpoint mechanism ensures crash resilience. If a backup run fails or is interrupted, the checkpoint database retains the last successfully committed offset for each partition. The next backup run automatically resumes from that point. No data is lost and no duplicate data is written to storage.
How are consumer group offsets backed up?
Consumer group offsets are stored as `x-original-offset` headers within the backed-up messages. During restore, a three-phase process recovers them:
1. **Restore** messages to the target cluster
2. **Plan** the offset reset using `kafka-backup offset-reset plan` to compute the mapping between original and new offsets
3. **Execute** the offset reset using `kafka-backup offset-reset execute` to commit the mapped offsets to the target cluster's consumer groups
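The plan step can be pictured as a pure lookup: for each consumer group's committed offset (recorded via the original-offset headers), find where that message landed in the restored topic. This toy mapping is for illustration only; the actual plan file format is not shown here:

```python
def plan_offset_reset(committed, original_to_new):
    """Map each group's committed offset in the source cluster to the
    corresponding offset in the restored topic."""
    plan = {}
    for (group, topic, part), orig_offset in committed.items():
        new_offset = original_to_new[(topic, part)][orig_offset]
        plan[(group, topic, part)] = new_offset
    return plan

# Original offsets 100-102 landed at 0-2 in the restored topic.
original_to_new = {("orders", 0): {100: 0, 101: 1, 102: 2}}
committed = {("billing-app", "orders", 0): 101}

print(plan_offset_reset(committed, original_to_new))
# {('billing-app', 'orders', 0): 1}
```

Splitting plan from execute lets an operator review the mapping before any offsets are committed to the target cluster.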
Can I run multiple backup instances simultaneously?
Yes. You can run multiple instances of OSO Kafka Backup concurrently, provided each instance is configured to back up a different set of topics. Use non-overlapping `topics.include` patterns to partition the workload. Do not configure multiple instances to back up the same topic-partition, as this will result in duplicate data in storage.
How do I verify a backup?
Use the built-in validation command:
```bash
kafka-backup validate --deep --config /path/to/config.yaml
```
The `--deep` flag performs a full integrity check, verifying that all segments are present, checksums are valid, and the manifest is consistent with the stored data.
What is the maximum supported message size?
The maximum message size is governed by the Kafka cluster's `max.message.bytes` configuration, which defaults to 1 MB. OSO Kafka Backup has been tested with messages up to 10 MB. If your cluster uses a non-default maximum, ensure the backup tool's consumer configuration matches (via `message.max.bytes` in the consumer properties).
Restore & Recovery
How do I restore to a specific point in time?
Use the `time_window_start` and `time_window_end` parameters in your restore configuration, specified in epoch milliseconds:
```yaml
restore:
  time_window_start: 1742817600000  # 2025-03-24 12:00:00 UTC
  time_window_end: 1742846400000    # 2025-03-24 20:00:00 UTC
  source:
    storage:
      type: s3
      bucket: my-kafka-backups
  target:
    bootstrap_servers: "target-kafka:9092"
```
Only messages with timestamps within the specified window will be restored.
How do I convert a date to epoch milliseconds?
**Bash:**
```bash
date -d "2025-03-24 12:00:00 UTC" +%s%3N
# Output: 1742817600000
```
**Python** (note the explicit UTC timezone; a naive `datetime` would use local time):
```python
from datetime import datetime, timezone
int(datetime(2025, 3, 24, 12, tzinfo=timezone.utc).timestamp() * 1000)
# Output: 1742817600000
```
**macOS (BSD date):**
```bash
date -j -u -f "%Y-%m-%d %H:%M:%S" "2025-03-24 12:00:00" +%s000
```
Can I restore to a different cluster?
Yes. Specify the target cluster's `bootstrap_servers` in your restore configuration. The source and target clusters are completely independent. This is a core use case for disaster recovery -- restoring data to a standby cluster in a different region or cloud provider.
Can I restore to a different topic name?
Yes. Use the `topic_mapping` configuration to remap topic names during restore:
```yaml
restore:
  topic_mapping:
    "orders.production": "orders.restored"
    "payments.production": "payments.restored"
```
How do I recover consumer offsets after a restore?
Use the two-step offset reset workflow:
```bash
# Step 1: Generate the offset mapping plan
kafka-backup offset-reset plan \
  --config /path/to/config.yaml \
  --output offset-plan.json

# Step 2: Review and execute the plan
kafka-backup offset-reset execute \
  --plan offset-plan.json \
  --target-bootstrap-servers target-kafka:9092
```
The plan maps original offsets to the corresponding offsets in the restored topic, accounting for any gaps or reordering.
How long does a restore take?
Restore duration depends on the data volume, storage backend read throughput, network bandwidth, and target cluster write capacity. Under optimal conditions, OSO Kafka Backup achieves approximately 100 MB/s restore throughput. For example, restoring 1 TB of data takes roughly 2.5 to 3 hours.
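The 1 TB figure follows directly from the throughput. A quick back-of-envelope calculation, assuming a sustained 100 MB/s with no other bottleneck:

```python
data_tb = 1.0
throughput_mb_s = 100

seconds = data_tb * 1_000_000 / throughput_mb_s  # 1 TB = 1,000,000 MB
hours = seconds / 3600
print(f"{hours:.1f} hours")  # 2.8 hours
```

Real restores usually land somewhat above this ideal because of storage read latency and target-cluster write capacity, hence the 2.5 to 3 hour range quoted above.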
Can I restore a subset of partitions?
Yes. Use the `source_partitions` configuration to specify which partitions to restore:
```yaml
restore:
  source_partitions: [0, 1, 2, 5]
```
Only the specified partitions will be restored from the backup.
What happens if the target topic already has data?
OSO Kafka Backup appends data to the target topic; it does not overwrite or truncate existing data. If you need a clean restore, create a new topic (or use topic_mapping to restore to a different topic name) to avoid mixing existing and restored data.
Storage
What storage backends are supported?
OSO Kafka Backup supports:
- Amazon S3
- Azure Blob Storage
- Google Cloud Storage (GCS)
- S3-compatible storage (MinIO, Ceph, Wasabi, DigitalOcean Spaces)
- Local filesystem (for testing and development)
How much storage will my backups consume?
Estimate storage as:
```
storage_required = raw_data_size / compression_ratio
```
Compression ratios vary by data type:
| Data Type | Compression (zstd) | Example |
|---|---|---|
| JSON | 5:1 to 7:1 | 1 TB raw ≈ 140-200 GB compressed |
| Avro | 2:1 to 3:1 | 1 TB raw ≈ 330-500 GB compressed |
| Protobuf | 2:1 to 3:1 | 1 TB raw ≈ 330-500 GB compressed |
| Already compressed | ~1:1 | No significant reduction |
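Applying the formula above, e.g. for 1 TB of raw JSON at zstd's typical 5:1 to 7:1 ratios:

```python
def storage_required_gb(raw_gb, compression_ratio):
    # storage_required = raw_data_size / compression_ratio
    return raw_gb / compression_ratio

raw_gb = 1000  # 1 TB of raw JSON
print(f"{storage_required_gb(raw_gb, 7):.0f}-{storage_required_gb(raw_gb, 5):.0f} GB")
# 143-200 GB
```

Remember to multiply by the number of retained backup generations if you keep more than one.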
What is the backup storage format?
Backups are organized as follows:
```
backup-root/
├── manifest.json
├── state/
│   └── offsets.db
└── topics/
    └── {topic-name}/
        └── partition={id}/
            ├── segment-000000000000.zst
            ├── segment-000000001000.zst
            └── ...
```
- `manifest.json` -- metadata about the backup (topics, partitions, offsets, timestamps)
- `state/offsets.db` -- SQLite checkpoint database tracking committed offsets
- `topics/{topic}/partition={id}/segment-NNNN.zst` -- compressed data segments
Can I access backup data without restoring?
Yes. Use the `describe` command to inspect backup metadata:
```bash
kafka-backup describe --config /path/to/config.yaml
```
You can also directly access objects in S3 (or other storage) using standard tools such as the AWS CLI, `gsutil`, or `az storage blob`. Segment files are compressed with the configured algorithm (e.g., zstd) and contain Kafka records in a binary format.
Can I migrate backups between storage backends?
Yes. Since backups are stored as standard objects, you can copy them between backends using tools like `aws s3 sync`, `gsutil rsync`, `azcopy`, or `rclone`. After copying, update your restore configuration to point to the new storage location.
Does OSO Kafka Backup work with S3-compatible storage?
Yes. Configure the endpoint URL to point to your S3-compatible service:
```yaml
storage:
  type: s3
  bucket: my-backups
  region: us-east-1
  endpoint: "https://minio.internal:9000"
  force_path_style: true
```
This works with MinIO, Ceph Object Gateway, Wasabi, DigitalOcean Spaces, and other S3-compatible services.
Performance
What throughput can I expect?
Under optimal conditions, OSO Kafka Backup achieves 100+ MB/s for both backup and restore operations. Actual throughput depends on:
- Network bandwidth between Kafka, the backup tool, and storage
- Storage backend write/read latency
- Compression algorithm and level
- Message size (larger messages achieve higher throughput)
- Number of partitions being processed concurrently
How much memory does OSO Kafka Backup use?
Typical memory usage is under 500 MB when processing 4 partitions concurrently. Memory consumption scales with the number of concurrent partitions and the configured segment size. For high-concurrency workloads, monitor RSS via the `process_resident_memory_bytes` Prometheus metric and adjust `segment_max_bytes` or concurrency settings accordingly.
How do I tune for maximum throughput?
Refer to PE-01: Throughput Optimisation in the Performance Efficiency pillar. Key tuning parameters:
- **Segment size**: increase `segment_max_bytes` to reduce the number of storage write operations
- **Fetch size**: increase `fetch.max.bytes` and `max.partition.fetch.bytes` in the consumer config
- **Compression level**: use a lower zstd compression level (e.g., 1-3) for faster compression at the cost of slightly larger files
- **Co-location**: deploy the backup tool in the same region and availability zone as the Kafka cluster and storage backend
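To see why segment size matters, a rough count of storage write operations per backup run (the numbers below are illustrative, not measured):

```python
def write_ops(raw_bytes, compression_ratio, segment_max_bytes):
    # Each flushed segment is roughly one PUT to object storage.
    compressed = raw_bytes / compression_ratio
    return -(-compressed // segment_max_bytes)  # ceiling division

raw = 100 * 1024**3  # 100 GiB of raw data per run, 4:1 compression assumed
for seg_mib in (64, 256, 1024):
    ops = write_ops(raw, 4, seg_mib * 1024**2)
    print(f"{seg_mib:>5} MiB segments -> {int(ops):>4} PUTs")
```

Fewer, larger PUTs amortise per-request latency and cost, at the price of more memory buffered per partition.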
What impact does backup have on the Kafka cluster?
Minimal. OSO Kafka Backup operates as a standard Kafka consumer. It does not require any broker restarts, plugins, or configuration changes. The impact is equivalent to adding another consumer to the cluster. For latency-sensitive workloads, consider configuring a dedicated consumer group and using rack-aware replica fetching.
How do I benchmark performance?
The kafka-backup-demos repository includes a benchmark suite that generates synthetic workloads and measures backup/restore throughput under various configurations. Use it to establish baselines for your environment before deploying to production.
Security & Compliance
Is backup data encrypted?
Server-side encryption: All major cloud storage providers offer server-side encryption (SSE-S3, SSE-KMS, Azure Storage Service Encryption, GCS default encryption). Enable this on your storage bucket for encryption at rest.
Client-side encryption (Enterprise): The Enterprise edition supports client-side AES-256 encryption, where data is encrypted before it leaves the backup tool. This ensures data is encrypted in transit to storage and at rest, regardless of the storage provider's encryption settings.
How do I configure TLS or mTLS?
Set the security protocol and certificate paths in your configuration:
```yaml
kafka:
  bootstrap_servers: "kafka:9093"
  security_protocol: "SSL"  # or "SASL_SSL" for SASL + TLS
  ssl_ca_location: "/certs/ca.pem"
  ssl_certificate_location: "/certs/client.pem"
  ssl_key_location: "/certs/client-key.pem"
```
For mTLS, provide both the client certificate and key. The CA certificate is used to verify the broker's identity.
What SASL authentication mechanisms are supported?
OSO Kafka Backup supports the following SASL mechanisms:
- `PLAIN` -- username and password (use with TLS)
- `SCRAM-SHA-256` -- Salted Challenge Response Authentication
- `SCRAM-SHA-512` -- Salted Challenge Response Authentication (stronger hash)
```yaml
kafka:
  security_protocol: "SASL_SSL"
  sasl_mechanism: "SCRAM-SHA-512"
  sasl_username: "backup-user"
  sasl_password: "${KAFKA_SASL_PASSWORD}"
```
How does OSO Kafka Backup support GDPR compliance?
The Enterprise edition provides GDPR compliance tools:
- Data masking: Redact or mask personally identifiable information (PII) during backup
- Right to be forgotten: Delete specific records from backups by key
- Field-level redaction: Selectively redact fields within messages while preserving the rest of the record
These features enable compliance with data protection regulations without sacrificing backup completeness.
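Field-level redaction of the kind described can be pictured as a transform applied to each record before it is written to storage. The sketch below is a hand-rolled illustration for JSON values, not the Enterprise feature's actual API:

```python
import json

def redact_fields(record_value, fields, mask="***REDACTED***"):
    """Replace the named top-level fields in a JSON record value,
    keeping the rest of the record intact."""
    doc = json.loads(record_value)
    for field in fields:
        if field in doc:
            doc[field] = mask
    return json.dumps(doc)

original = '{"order_id": 42, "email": "alice@example.com", "total": 99.5}'
print(redact_fields(original, ["email"]))
# {"order_id": 42, "email": "***REDACTED***", "total": 99.5}
```

The key property is that non-PII fields (here `order_id` and `total`) survive unchanged, so the backup remains useful for replay and analytics.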
How do I restrict who can perform restore operations?
Multiple layers of access control are available:
- **Enterprise RBAC**: define roles (`backup-operator`, `restore-operator`, `admin`) with fine-grained permissions
- **IAM policies**: restrict access to storage buckets using AWS IAM, Azure RBAC, or GCP IAM
- **Kubernetes RBAC**: limit which service accounts can create `KafkaRestore` custom resources
Kubernetes & Deployment
How do I deploy on Kubernetes?
Install the Kafka Backup Operator via Helm:
```bash
helm repo add oso https://charts.oso.sh
helm repo update
helm install kafka-backup-operator oso/kafka-backup-operator \
  --namespace kafka-backup \
  --create-namespace
```
Then create backup and restore resources using CRDs:
```yaml
apiVersion: kafkabackup.oso.sh/v1alpha1
kind: KafkaBackup
metadata:
  name: production-backup
spec:
  configRef:
    name: backup-config
```
What Kubernetes versions are supported?
OSO Kafka Backup Operator requires Kubernetes 1.24 or later. It is tested against the latest three minor versions of Kubernetes.
Can I use ArgoCD or Flux for GitOps deployments?
Yes. Store your `KafkaBackup`, `KafkaRestore`, and `KafkaBackupSchedule` CRD manifests in a Git repository. Configure an ArgoCD `Application` or Flux `Kustomization` pointing to the manifests directory:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kafka-backup
spec:
  source:
    repoURL: https://github.com/myorg/k8s-manifests
    path: kafka-backup/
    targetRevision: main
  destination:
    server: https://kubernetes.default.svc
    namespace: kafka-backup
```
Can I deploy outside of Kubernetes?
Yes. OSO Kafka Backup ships as a standalone static binary that runs on bare metal, virtual machines, and Docker containers. No Kubernetes or container orchestration is required:
```bash
# Download the binary and make it executable
curl -LO https://github.com/osodevops/kafka-backup/releases/latest/download/kafka-backup-linux-amd64
chmod +x kafka-backup-linux-amd64

# Run directly
./kafka-backup-linux-amd64 backup --config /etc/kafka-backup/config.yaml
```
How do I monitor OSO Kafka Backup in Kubernetes?
The operator exposes Prometheus metrics on port 8080. Create a `ServiceMonitor` to scrape them:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kafka-backup-metrics
spec:
  selector:
    matchLabels:
      app: kafka-backup
  endpoints:
    - port: metrics
      interval: 15s
```
Pair this with the provided Grafana dashboards from the `kafka-backup-demos` repository for comprehensive visibility.
Enterprise
What features are included in the Enterprise edition?
The Enterprise edition extends the OSS version with:
- AES-256 client-side encryption for backup data
- Role-based access control (RBAC) for backup and restore operations
- Audit logging for all operations with tamper-proof log storage
- GDPR compliance tools including data masking, right to be forgotten, and field-level redaction
- Schema Registry backup and restore for Avro, Protobuf, and JSON Schema
- Priority support with defined SLAs
How do I get an Enterprise licence?
Contact the OSO sales team at oso.sh to discuss your requirements and obtain a licence key.
Is there a trial available?
Yes. A 30-day evaluation licence is available that provides full access to all Enterprise features. Contact the sales team to request a trial.
What support is included with Enterprise?
Enterprise support includes:
- Critical issues (P1): 24/7 response with a 1-hour initial response time
- Standard issues (P2-P4): Business hours support with response times based on severity
- Dedicated Slack channel for direct communication with the engineering team
- Quarterly architecture reviews to ensure your deployment follows best practices
Troubleshooting
How do I enable debug logging?
Use the `-v` flag for debug-level logging or `-vv` for trace-level:
```bash
# Debug logging
kafka-backup -v backup --config /path/to/config.yaml

# Trace logging (very verbose)
kafka-backup -vv backup --config /path/to/config.yaml
```
Alternatively, set the `RUST_LOG` environment variable:
```bash
RUST_LOG=debug kafka-backup backup --config /path/to/config.yaml
```
My backup is running slowly. How do I diagnose this?
Check the following, in order:
1. **Network latency**: measure latency between the backup tool and both the Kafka cluster and storage backend
2. **Storage write latency**: monitor the `kafka_backup_storage_write_duration_seconds` Prometheus metric
3. **Compression overhead**: try a faster compression level or algorithm (e.g., lz4 instead of zstd)
4. **Resource utilisation**: check CPU and memory usage on the host running the backup
5. **Consumer lag**: monitor `kafka_backup_consumer_lag` to see if the tool is keeping up with producers
I am getting a connection error. What should I check?
Verify the following:
- **Bootstrap servers**: ensure the `bootstrap_servers` address is correct and resolvable
- **TLS certificates**: verify certificates are valid, not expired, and the CA chain is complete
- **Network connectivity**: confirm the backup tool can reach the Kafka brokers on the configured port (e.g., `telnet kafka-broker 9093`)
- **Firewall rules**: check that security groups, NACLs, or firewall rules allow traffic on the Kafka port
- **Kafka ACLs**: ensure the backup user has `READ` and `DESCRIBE` permissions on the target topics and consumer group
My restore is failing. How do I troubleshoot?
Follow these steps:
1. **Validate the backup first**: run `kafka-backup validate --deep --config /path/to/config.yaml`
2. **Check target connectivity**: verify the restore tool can reach the target Kafka cluster
3. **Verify IAM/storage permissions**: ensure the restore process has read access to the backup storage location
4. **Check disk space**: ensure sufficient local disk space for temporary decompression buffers
5. **Review error logs**: enable debug logging (`-v`) and check for specific error messages
How do I report a bug or get community support?
- Bug reports: Open an issue on GitHub at github.com/osodevops/kafka-backup/issues
- Community support: Start a discussion at GitHub Discussions
- Enterprise support: Use your dedicated Slack channel or contact the support team directly