Helm Chart
Deploy OSO Kafka Backup Enterprise on Kubernetes as a scheduled CronJob using the official Helm chart. The chart packages everything needed for production Kafka backups — CronJob scheduling, license injection, credential management, metrics, and cloud IAM integration.
The enterprise image includes a 14-day free trial of all enterprise features — no signup, no license file needed. Install the chart, point it at your Kafka cluster, and it just works.
Prerequisites
- Kubernetes 1.26+
- Helm 3.8+ (OCI registry support)
- kubectl configured to access your cluster
- Access to your Kafka cluster from within the Kubernetes cluster
- A storage backend (S3, Azure Blob, GCS, or a PersistentVolume)
Install
The chart is published as an OCI artifact to GitHub Container Registry.
# Create namespace
kubectl create namespace kafka-backup
# Install with default values
helm install kafka-backup \
oci://ghcr.io/osodevops/charts/kafka-backup-enterprise \
--namespace kafka-backup
Verify Installation
# Check CronJob was created
kubectl get cronjob -n kafka-backup
# Expected output:
# NAME                                   SCHEDULE    SUSPEND   ACTIVE   LAST SCHEDULE
# kafka-backup-kafka-backup-enterprise   0 2 * * *   False     0        <none>
# Trigger a manual backup to test
kubectl create job --from=cronjob/kafka-backup-kafka-backup-enterprise \
kafka-backup-manual-test -n kafka-backup
# Watch the pod
kubectl get pods -n kafka-backup -w
# Check logs
kubectl logs -n kafka-backup -l job-name=kafka-backup-manual-test -f
Configuration
The chart uses a two-layer configuration approach:
- values.yaml — controls Kubernetes resources (scheduling, secrets, resources, monitoring)
- config.backupConfig — the raw kafka-backup YAML config, rendered into a ConfigMap
Credentials in the backup config use ${ENV_VAR} placeholders. The binary expands them at runtime from environment variables injected by Kubernetes Secrets.
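The expansion is plain environment-variable substitution, not Helm templating. As an illustration only (SR_PASS is a made-up example; the binary's internal implementation may differ), the effect is equivalent to:

```shell
# Mimic the runtime ${ENV_VAR} expansion the binary performs on the config.
export SR_PASS='secret123'
line='password: ${SR_PASS}'
expanded=$(printf '%s\n' "$line" | sed "s/\${SR_PASS}/${SR_PASS}/")
printf '%s\n' "$expanded"
# → password: secret123
```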
Minimal Example
cronjob:
  schedule: "0 2 * * *"  # Daily at 2 AM
config:
  backupConfig: |
    mode: backup
    backup_id: "daily-backup"
    source:
      bootstrap_servers:
        - kafka-broker-0.kafka:9092
        - kafka-broker-1.kafka:9092
    storage:
      backend: s3
      bucket: my-kafka-backups
      region: eu-west-1
serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789:role/kafka-backup
helm install kafka-backup \
oci://ghcr.io/osodevops/charts/kafka-backup-enterprise \
--namespace kafka-backup \
--values values.yaml
Full Enterprise Example
This example enables Schema Registry backup, Confluent RBAC backup, and Prometheus metrics:
cronjob:
  schedule: "0 */6 * * *"  # Every 6 hours
  concurrencyPolicy: Forbid
  activeDeadlineSeconds: 3600
image:
  tag: "0.3.1"
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: "1"
    memory: 1Gi
license:
  existingSecret: kafka-backup-license
credentials:
  existingSecret: kafka-backup-credentials
config:
  backupConfig: |
    mode: backup
    backup_id: "enterprise-backup"
    source:
      bootstrap_servers:
        - kafka:9092
    storage:
      backend: s3
      bucket: kafka-backups
      region: us-east-1
    enterprise:
      schema_registry:
        url: "https://schema-registry:8081"
        auth:
          type: basic
          username: ${SR_USER}
          password: ${SR_PASS}
        backup:
          subjects: ["*"]
          exclude: ["_*"]
      confluent_rbac:
        mds_url: "https://mds:8090"
        auth:
          username: ${MDS_USER}
          password: ${MDS_PASS}
        backup:
          principals: ["*"]
metrics:
  enabled: true
  serviceMonitor:
    enabled: true
    labels:
      release: prometheus
Notice how the backup config uses ${SR_USER}, ${SR_PASS}, etc. These are not Helm template variables — they are literal ${ENV_VAR} placeholders that the kafka-backup binary expands at runtime. The actual values come from the Kubernetes Secret referenced by credentials.existingSecret.
License
Enterprise features are gated by an Ed25519-signed license file, validated entirely offline — no license server, no network calls. Without a license, the binary operates in one of three modes:
| Mode | When | Enterprise Features |
|---|---|---|
| Auto-trial | First 14 days, no signup needed | All features enabled |
| OSS mode | After trial expires, no license | Disabled (warns and skips) |
| Licensed | Valid .lic or .key file provided | Per-license features enabled |
Enterprise features never block your backup. If Schema Registry backup is configured but not licensed, the binary logs a warning, skips that feature, and completes the Kafka data backup normally with exit code 0.
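A sketch of that degradation behavior (illustrative pseudologic only, not the binary's actual code): the unlicensed feature produces a warning on stderr, and the run still exits 0.

```shell
# Unlicensed enterprise feature: warn and skip; core backup still succeeds.
run_backup() {
  licensed_schema_registry=false  # pretend the license lacks this feature
  if [ "$licensed_schema_registry" = false ]; then
    echo "WARN: schema_registry configured but not licensed; skipping" >&2
  fi
  echo "kafka data backup complete"
  return 0
}
run_backup
```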
How the Binary Discovers the License
On startup, the kafka-backup binary checks these sources in order:
| Priority | Source | How the Helm chart uses it |
|---|---|---|
| 1 | ENTERPRISE_LICENSE_KEY env var | Default — chart injects this from a Kubernetes Secret |
| 2 | ENTERPRISE_LICENSE_FILE env var | Can be set via extraEnv if mounting a file |
| 3 | /etc/kafka-backup/license.key or .lic | Mount via extraVolumes |
| 4 | ~/.config/kafka-backup/license.lic | Not typical in containers |
| 5 | Auto-trial (14 days) | Fallback when no license is found |
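The lookup order in the table can be sketched as follows (illustrative only, not the binary's code; the file paths mirror the table above):

```shell
# First match wins; auto-trial is the fallback.
find_license() {
  if [ -n "${ENTERPRISE_LICENSE_KEY:-}" ]; then echo "env:key"; return; fi
  if [ -n "${ENTERPRISE_LICENSE_FILE:-}" ] && [ -f "${ENTERPRISE_LICENSE_FILE}" ]; then
    echo "env:file"; return
  fi
  for f in /etc/kafka-backup/license.key /etc/kafka-backup/license.lic \
           "${HOME}/.config/kafka-backup/license.lic"; do
    if [ -f "$f" ]; then echo "file:$f"; return; fi
  done
  echo "auto-trial"
}
# Priority 1 wins whenever the env var is set, which is how the chart injects it:
ENTERPRISE_LICENSE_KEY=dummy
find_license   # → env:key
```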
The Helm chart uses priority 1 by default — it sets the ENTERPRISE_LICENSE_KEY environment variable from a Kubernetes Secret. The value must be the base64-encoded content of your license file (.lic or .key).
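To produce that value, base64-encode the file content; a round-trip check using a throwaway file (path and content here are placeholders):

```shell
# Encode a license file the way the chart expects.
# tr -d '\n' guards against base64 implementations that wrap long lines.
printf -- '-----BEGIN LICENSE FILE-----\nplaceholder\n-----END LICENSE FILE-----\n' > /tmp/license.key
b64=$(base64 < /tmp/license.key | tr -d '\n')
# Round-trip: decoding must reproduce the original file exactly.
printf '%s' "$b64" | base64 -d | cmp -s - /tmp/license.key && echo "round-trip OK"
```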
License File Formats
The binary accepts two formats, auto-detected by the PEM header:
Keygen.sh .lic format (standard — issued via enterprise.kafkabackup.com):
-----BEGIN LICENSE FILE-----
eyJlbmMiOiJleUowZVhBaU9pSktWMVFp...
-----END LICENSE FILE-----
Custom PEM .key format (purchased via enterprise.kafkabackup.com):
-----BEGIN KAFKA-BACKUP LICENSE-----
eyJwYXlsb2FkIjp7ImxpY2Vuc2VfaWQi...
-----END KAFKA-BACKUP LICENSE-----
Both formats are verified with Ed25519 signatures against the public key embedded in the binary at compile time, so no network access is required. See Licensing for full details on the license payload, feature flags, and security model.
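A minimal sketch of header-based detection, assuming the two headers shown above are the only formats (the binary's internal logic may differ; files and paths here are throwaways):

```shell
# Create two throwaway files carrying the documented PEM headers.
printf -- '-----BEGIN LICENSE FILE-----\nplaceholder\n' > /tmp/keygen.lic
printf -- '-----BEGIN KAFKA-BACKUP LICENSE-----\nplaceholder\n' > /tmp/custom.key

# Classify a license file by its first line.
detect_format() {
  case "$(head -n 1 "$1")" in
    '-----BEGIN LICENSE FILE-----')         echo "keygen.sh .lic" ;;
    '-----BEGIN KAFKA-BACKUP LICENSE-----') echo "custom PEM .key" ;;
    *)                                      echo "unknown" ;;
  esac
}

detect_format /tmp/keygen.lic   # → keygen.sh .lic
detect_format /tmp/custom.key   # → custom PEM .key
```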
Injecting the License via Helm
Using an Existing Secret (recommended)
Create the Secret, then reference it in your values. This works with both .lic and .key files:
# Create the license secret (works with either license.lic or license.key).
# Note: pipe through `tr -d '\n'` in case your base64 wraps long lines.
kubectl create secret generic kafka-backup-license \
  --from-literal=license-b64="$(base64 < license.key | tr -d '\n')" \
  --namespace kafka-backup
license:
  existingSecret: kafka-backup-license
  existingSecretKey: license-b64  # default key name
The chart sets ENTERPRISE_LICENSE_KEY on the pod from this Secret. At startup, the binary base64-decodes the value, detects the PEM format, verifies the Ed25519 signature, and extracts the licensed features.
For production, consider using External Secrets Operator or Sealed Secrets to manage the license Secret. The chart works with any Secret — it just references the name:
license:
  existingSecret: my-externalsecret-license
Inline License Key (dev/test only)
For quick testing, pass the base64-encoded license directly:
helm install kafka-backup \
oci://ghcr.io/osodevops/charts/kafka-backup-enterprise \
--namespace kafka-backup \
--set license.key="$(base64 < license.key)"
Never commit license keys to version control. Use existingSecret with a Secret created out-of-band.
Mounting as a File (alternative)
If you prefer file-based license discovery (priority 3), mount the license file directly:
extraVolumes:
  - name: license
    secret:
      secretName: kafka-backup-license-file
extraVolumeMounts:
  - name: license
    mountPath: /etc/kafka-backup
    readOnly: true
# Create the secret containing the raw license file
kubectl create secret generic kafka-backup-license-file \
--from-file=license.key=/path/to/your/license.key \
--namespace kafka-backup
With this approach, don't set license.existingSecret or license.key — the binary will find the file at /etc/kafka-backup/license.key automatically.
Auto-Trial in Kubernetes
When no license is configured, the binary activates a 14-day auto-trial with all enterprise features enabled. The trial state is tracked in a file at /var/lib/kafka-backup/.trial.
In a CronJob context, each pod starts fresh with an empty filesystem. This means:
- Without trialPersistence: each pod creates a new trial state file, so the trial effectively never expires across CronJob runs
- With trialPersistence.enabled: true: a PVC persists the trial state across runs, so the 14-day countdown is accurate
For production, always use a real license. To disable the auto-trial entirely:
extraEnv:
  - name: KAFKA_BACKUP_NO_TRIAL
    value: "1"
Verifying the License
Helm Test
Run the built-in Helm test to verify the license was injected correctly:
helm test kafka-backup -n kafka-backup
This creates a pod that runs kafka-backup license info and reports the license status.
Manual Check
# Exec into a running backup pod (during a CronJob run)
kubectl exec -it -n kafka-backup <pod-name> -- kafka-backup license info
Expected output when licensed:
License ID: abcdef12-3456-7890-abcd-ef1234567890
Customer: Acme Corp (acme@example.com)
Tier: Enterprise
Features: encryption, schema_registry, rbac, audit, support
Expires: 2027-01-15 (285 days remaining)
Status: Valid
CLI License Commands
The kafka-backup binary provides three license management commands:
| Command | Description |
|---|---|
| kafka-backup license info | Show current license status (what the binary is using) |
| kafka-backup license verify --file license.lic | Verify a license file without applying it |
| kafka-backup license apply --file license.lic | Validate and save to ~/.config/kafka-backup/ |
In Kubernetes, license info is the most useful — it shows exactly what license the pod detected at startup. The apply command saves to the user config directory, which isn't persistent in containers — use Secrets instead.
See Licensing for the full licensing guide including obtaining licenses, license tiers, offline validation, and FAQ.
Credentials
Service credentials (Kafka SASL, Schema Registry auth, MDS auth, S3 keys) are injected as environment variables. The backup config YAML uses ${ENV_VAR} placeholders, and the binary expands them at runtime.
Using an Existing Secret (recommended)
Create a Secret with all credential environment variables:
kubectl create secret generic kafka-backup-credentials \
--from-literal=SR_USER=sr-admin \
--from-literal=SR_PASS=secret123 \
--from-literal=MDS_USER=mds-admin \
--from-literal=MDS_PASS=secret456 \
--from-literal=AWS_ACCESS_KEY_ID=AKIA... \
--from-literal=AWS_SECRET_ACCESS_KEY=wJal... \
--namespace kafka-backup
credentials:
  existingSecret: kafka-backup-credentials
All keys in the Secret are injected as environment variables into the pod via envFrom.
Inline Credentials (dev/test only)
credentials:
  inline:
    SR_USER: sr-admin
    SR_PASS: secret123
    MDS_USER: mds-admin
    MDS_PASS: secret456
Inline credentials are stored in a Kubernetes Secret created by the chart, but the values are visible in your values.yaml file. Use existingSecret for production.
Cloud IAM (no static credentials)
For S3, Azure Blob, or GCS storage, use your cloud provider's workload identity instead of static credentials. See Cloud Provider Setup below.
Scheduling
CronJob (default)
The chart creates a Kubernetes CronJob that runs kafka-backup backup on a schedule:
cronjob:
  enabled: true
  schedule: "0 2 * * *"            # Daily at 2 AM UTC
  timeZone: "Europe/London"        # Kubernetes 1.27+
  concurrencyPolicy: Forbid        # Never overlap
  backoffLimit: 2                  # Retry twice on failure
  activeDeadlineSeconds: 7200      # Kill after 2 hours
  ttlSecondsAfterFinished: 86400   # Clean up after 24 hours
| Value | Default | Description |
|---|---|---|
| cronjob.schedule | 0 2 * * * | Cron schedule expression |
| cronjob.timeZone | "" | IANA timezone (requires K8s 1.27+) |
| cronjob.concurrencyPolicy | Forbid | Prevent overlapping backup runs |
| cronjob.backoffLimit | 2 | Number of retries on failure |
| cronjob.activeDeadlineSeconds | 7200 | Maximum runtime before kill |
| cronjob.suspend | false | Pause without deleting |
Triggering a Manual Backup
kubectl create job --from=cronjob/kafka-backup-kafka-backup-enterprise \
kafka-backup-manual-$(date +%s) -n kafka-backup
Suspending Backups
Pause scheduled backups without deleting the CronJob:
kubectl patch cronjob kafka-backup-kafka-backup-enterprise \
-n kafka-backup \
-p '{"spec":{"suspend":true}}'
Or via Helm:
helm upgrade kafka-backup \
oci://ghcr.io/osodevops/charts/kafka-backup-enterprise \
--namespace kafka-backup \
--set cronjob.suspend=true
Ad-hoc Job (Backup or Restore)
For one-shot operations like restores, enable the Job workload instead of the CronJob:
Restore
helm install kafka-restore \
oci://ghcr.io/osodevops/charts/kafka-backup-enterprise \
--namespace kafka-backup \
--set cronjob.enabled=false \
--set job.enabled=true \
--set job.command=restore \
--values restore-values.yaml
cronjob:
  enabled: false
job:
  enabled: true
  command: restore
  backoffLimit: 0
  activeDeadlineSeconds: 7200
config:
  backupConfig: |
    mode: restore
    backup_id: "daily-backup"
    target:
      bootstrap_servers:
        - kafka:9092
    storage:
      backend: s3
      bucket: kafka-backups
      region: us-east-1
# Watch the restore
kubectl logs -n kafka-backup -l job-name=kafka-restore-kafka-backup-enterprise-job -f
Schema-Only Backup
helm install schema-backup \
oci://ghcr.io/osodevops/charts/kafka-backup-enterprise \
--namespace kafka-backup \
--set cronjob.enabled=false \
--set job.enabled=true \
--set "job.extraArgs={--schema-only}" \
--values values.yaml
Jobs auto-delete after ttlSecondsAfterFinished (default: 24 hours). To clean up immediately:
helm uninstall kafka-restore -n kafka-backup
Cloud Provider Setup
Use workload identity to grant the backup pod access to cloud storage without static credentials.
AWS (EKS with IRSA)
# Create IAM policy for S3 access
cat > /tmp/kafka-backup-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::kafka-backups",
        "arn:aws:s3:::kafka-backups/*"
      ]
    }
  ]
}
EOF
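Before calling aws iam create-policy, it can be worth validating the policy document locally (assumes python3 is available; the placeholder write below is only so this snippet is self-contained, since in practice the file comes from the heredoc above):

```shell
# Write a minimal placeholder policy, then confirm it parses as valid JSON.
printf '{"Version": "2012-10-17", "Statement": []}\n' > /tmp/kafka-backup-policy.json
python3 -m json.tool /tmp/kafka-backup-policy.json > /dev/null && echo "policy JSON OK"
```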
aws iam create-policy \
--policy-name KafkaBackupPolicy \
--policy-document file:///tmp/kafka-backup-policy.json
# Create IAM service account with IRSA
eksctl create iamserviceaccount \
--name kafka-backup-enterprise \
--namespace kafka-backup \
--cluster my-cluster \
--attach-policy-arn arn:aws:iam::123456789:policy/KafkaBackupPolicy \
--approve
serviceAccount:
  create: false
  name: kafka-backup-enterprise
  # Service account created by eksctl with IRSA annotation
Azure (AKS with Workload Identity)
# Create managed identity
az identity create \
--name kafka-backup-identity \
--resource-group myResourceGroup
CLIENT_ID=$(az identity show \
--name kafka-backup-identity \
--resource-group myResourceGroup \
--query clientId -o tsv)
# Grant Storage Blob Data Contributor role
az role assignment create \
--assignee $CLIENT_ID \
--role "Storage Blob Data Contributor" \
--scope /subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Storage/storageAccounts/<account>
# Create federated credential
az identity federated-credential create \
--name kafka-backup-federated \
--identity-name kafka-backup-identity \
--resource-group myResourceGroup \
--issuer $(az aks show --name myAKSCluster --resource-group myResourceGroup --query oidcIssuerProfile.issuerUrl -o tsv) \
--subject system:serviceaccount:kafka-backup:kafka-backup-enterprise \
--audience api://AzureADTokenExchange
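The --subject value must match the pod's ServiceAccount exactly, in the fixed system:serviceaccount:<namespace>:<name> format (the values below mirror this page's examples):

```shell
# Build the federated-credential subject from namespace and ServiceAccount name.
ns=kafka-backup
sa=kafka-backup-enterprise
subject="system:serviceaccount:${ns}:${sa}"
echo "$subject"
# → system:serviceaccount:kafka-backup:kafka-backup-enterprise
```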
serviceAccount:
  create: true
  annotations:
    azure.workload.identity/client-id: "<managed-identity-client-id>"
podLabels:
  azure.workload.identity/use: "true"
GCP (GKE with Workload Identity)
# Create GCP service account
gcloud iam service-accounts create kafka-backup-sa
# Grant Storage Object Admin
gcloud projects add-iam-policy-binding PROJECT_ID \
--member "serviceAccount:kafka-backup-sa@PROJECT_ID.iam.gserviceaccount.com" \
--role "roles/storage.objectAdmin"
# Bind to Kubernetes service account
gcloud iam service-accounts add-iam-policy-binding \
kafka-backup-sa@PROJECT_ID.iam.gserviceaccount.com \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:PROJECT_ID.svc.id.goog[kafka-backup/kafka-backup-enterprise]"
serviceAccount:
  create: true
  annotations:
    iam.gke.io/gcp-service-account: kafka-backup-sa@PROJECT_ID.iam.gserviceaccount.com
Monitoring
Prometheus Pod Annotations
When metrics.enabled is true, the chart adds Prometheus scrape annotations to the CronJob pods:
metrics:
  enabled: true
config:
  backupConfig: |
    mode: backup
    # ... your config ...
    metrics:
      enabled: true
      port: 8080
      path: /metrics
Prometheus pod annotations require your Prometheus instance to be configured for pod service discovery. CronJob pods are short-lived — metrics are only scrapeable while a backup is running.
ServiceMonitor (Prometheus Operator)
For clusters using the Prometheus Operator:
metrics:
  enabled: true
  serviceMonitor:
    enabled: true
    interval: 15s
    labels:
      release: prometheus  # Must match your Prometheus selector
Available Metrics
The kafka-backup binary exposes Prometheus metrics during execution:
| Metric | Type | Description |
|---|---|---|
| kafka_backup_records_processed | Counter | Total records backed up |
| kafka_backup_bytes_processed | Counter | Total bytes backed up |
| kafka_backup_segments_completed | Counter | Completed segment files |
| kafka_backup_duration_seconds | Histogram | Backup duration |
| kafka_backup_errors_total | Counter | Errors during backup |
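These are exposed in the standard Prometheus text format. A quick way to pull one value out of a scrape, using a made-up sample payload:

```shell
# Sample scrape body (values are invented for illustration).
scrape='kafka_backup_records_processed 128000
kafka_backup_bytes_processed 734003200
kafka_backup_errors_total 0'
# Extract a single metric's value by name.
records=$(printf '%s\n' "$scrape" | awk '$1 == "kafka_backup_records_processed" { print $2 }')
echo "$records"
# → 128000
```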
Security
Pod Security
The chart enforces a hardened security context by default:
# Defaults — no changes needed
podSecurityContext:
  runAsNonRoot: true
  runAsUser: 1000
  runAsGroup: 1000
  fsGroup: 1000
securityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: [ALL]
The container runs as non-root (UID 1000) with a read-only filesystem. Writable paths:
- /tmp — emptyDir for temporary files
- /var/lib/kafka-backup — optional PVC for trial state persistence
Kafka mTLS
To connect to a Kafka cluster using mTLS, mount your certificates via extraVolumes:
extraVolumes:
  - name: kafka-certs
    secret:
      secretName: kafka-client-certs
extraVolumeMounts:
  - name: kafka-certs
    mountPath: /certs/kafka
    readOnly: true
config:
  backupConfig: |
    mode: backup
    source:
      bootstrap_servers: ["kafka:9093"]
      security_protocol: SSL
      ssl:
        ca_location: /certs/kafka/ca.crt
        certificate_location: /certs/kafka/client.crt
        key_location: /certs/kafka/client.key
    storage:
      backend: s3
      bucket: kafka-backups
Upgrade
helm upgrade kafka-backup \
oci://ghcr.io/osodevops/charts/kafka-backup-enterprise \
--namespace kafka-backup \
--values values.yaml
To upgrade to a specific version:
helm upgrade kafka-backup \
oci://ghcr.io/osodevops/charts/kafka-backup-enterprise \
--namespace kafka-backup \
--version 0.2.0
Uninstall
helm uninstall kafka-backup --namespace kafka-backup
If trialPersistence.enabled was set, the PVC is not deleted on uninstall (Helm default). Delete it manually if no longer needed:
kubectl delete pvc kafka-backup-kafka-backup-enterprise-trial -n kafka-backup
Troubleshooting
Backup Pod Not Starting
# Check CronJob status
kubectl get cronjob -n kafka-backup
# Check recent jobs
kubectl get jobs -n kafka-backup --sort-by='.metadata.creationTimestamp'
# Check events
kubectl get events -n kafka-backup --sort-by='.lastTimestamp' | tail -20
Backup Failing
# Get logs from the most recent job
kubectl logs -n kafka-backup -l app.kubernetes.io/instance=kafka-backup --tail=200
# Or find the specific pod
kubectl get pods -n kafka-backup
kubectl logs -n kafka-backup <pod-name>
Permission Denied (S3/Cloud Storage)
# Verify service account annotations
kubectl get sa -n kafka-backup -o yaml
# Test IAM from within the pod (AWS)
kubectl run aws-test --rm -it --image=amazon/aws-cli \
--overrides='{"spec":{"serviceAccountName":"kafka-backup-enterprise"}}' \
-n kafka-backup -- sts get-caller-identity
License Not Detected
# Run the Helm test
helm test kafka-backup -n kafka-backup
# Or manually check
kubectl run license-check --rm -it \
--image=osodevops/kafka-backup-enterprise:0.3.1 \
--env="ENTERPRISE_LICENSE_KEY=$(kubectl get secret kafka-backup-license -n kafka-backup -o jsonpath='{.data.license-b64}' | base64 -d)" \
-n kafka-backup -- license info
CronJob Never Fires
# Check if suspended
kubectl get cronjob -n kafka-backup -o jsonpath='{.items[0].spec.suspend}'
# Check startingDeadlineSeconds — if the controller was down when the job
# was due, it may have been skipped
kubectl describe cronjob -n kafka-backup kafka-backup-kafka-backup-enterprise
Values Reference
Quick Reference
| Value | Default | Description |
|---|---|---|
| image.repository | osodevops/kafka-backup-enterprise | Docker image |
| image.tag | Chart appVersion | Image tag |
| cronjob.enabled | true | Create a CronJob |
| cronjob.schedule | 0 2 * * * | Cron schedule |
| cronjob.concurrencyPolicy | Forbid | Overlap prevention |
| cronjob.activeDeadlineSeconds | 7200 | Maximum runtime |
| job.enabled | false | Create a one-shot Job |
| job.command | backup | backup or restore |
| license.existingSecret | "" | Secret with license key |
| license.key | "" | Inline license (dev only) |
| config.backupConfig | (minimal example) | Raw kafka-backup YAML |
| config.existingConfigMap | "" | Use pre-existing ConfigMap |
| credentials.existingSecret | "" | Secret with credential env vars |
| credentials.inline | {} | Inline credentials (dev only) |
| metrics.enabled | false | Enable Prometheus metrics |
| metrics.serviceMonitor.enabled | false | Create ServiceMonitor |
| trialPersistence.enabled | false | PVC for trial state file |
| serviceAccount.create | true | Create ServiceAccount |
| serviceAccount.annotations | {} | SA annotations (IRSA, WI) |
For the complete values.yaml with all options, see the chart source.
Next Steps
- Licensing — obtain and manage your license
- Schema Registry Backup — configure Schema Registry backup
- Confluent RBAC Backup — back up MDS role bindings
- Monitoring Setup — configure alerts and dashboards