Self-Assessment Checklist
Use this checklist to evaluate the maturity of your Kafka backup architecture across all six pillars. Score each item honestly — the goal is to identify improvement areas, not to achieve a perfect score on day one.
How to Score
Rate each item on a 0–3 scale:
| Score | Level | Description |
|---|---|---|
| 0 | Not implemented | No action taken |
| 1 | Basic | Partially implemented, manual processes |
| 2 | Advanced | Fully implemented, mostly automated |
| 3 | Expert | Fully automated, continuously improved, measured |
Scoring Thresholds
| Total Score | Maturity | Action |
|---|---|---|
| 0–25 | Critical gaps | Address immediately — your backup infrastructure has significant risk |
| 26–50 | Developing | Create a prioritised improvement plan targeting the lowest-scoring pillars |
| 51–70 | Mature | Focus on optimisation and automation of remaining manual processes |
| 71–87 | Well-Architected | Maintain through continuous improvement and regular reassessment |
Re-run this assessment quarterly, or after significant changes to your Kafka environment (new topics, increased throughput, new compliance requirements). Track your score over time to measure improvement.
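To make quarterly tracking repeatable, the threshold table can be encoded as a small lookup. The following is a minimal sketch, not part of kafka-backup itself; the names `BANDS` and `maturity_band` are hypothetical and the band boundaries are copied from the table above.

```python
# Upper bound of each band and its maturity label, taken from the
# Scoring Thresholds table. BANDS and maturity_band are illustrative names.
BANDS = [
    (25, "Critical gaps"),
    (50, "Developing"),
    (70, "Mature"),
    (87, "Well-Architected"),
]


def maturity_band(total: int) -> str:
    """Return the maturity label for a total score between 0 and 87."""
    if not 0 <= total <= 87:
        raise ValueError(f"total must be between 0 and 87, got {total}")
    # Bands are sorted by upper bound, so the first match is the right one.
    for upper, label in BANDS:
        if total <= upper:
            return label
    raise AssertionError("unreachable: 87 is the last upper bound")


if __name__ == "__main__":
    print(maturity_band(58))
```

Recording the returned label alongside the raw total each quarter gives a simple history to compare against.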
Operational Excellence
| # | Check | Score (0–3) |
|---|---|---|
| 1 | Backup operations have a designated owner with clear escalation paths | |
| 2 | Backup schedules are fully automated (no manual runs required) | |
| 3 | Monitoring and alerting cover all key backup metrics (lag, throughput, errors, checkpoint age) | |
| 4 | DR runbooks exist with exact kafka-backup CLI commands and have been tested | |
| 5 | All backup configuration is version-controlled and deployed via GitOps or CI/CD | |
Pillar subtotal: ___ / 15
Security
| # | Check | Score (0–3) |
|---|---|---|
| 6 | Least-privilege IAM policies are enforced for backup and restore processes separately | |
| 7 | All backup data is encrypted at rest (SSE or client-side encryption) | |
| 8 | All connections are encrypted in transit (TLS 1.2+ for Kafka, HTTPS for storage) | |
| 9 | No hardcoded credentials — all secrets managed via a secrets manager or environment variables | |
| 10 | Audit logging is enabled for all backup and restore operations | |
Pillar subtotal: ___ / 15
Reliability
| # | Check | Score (0–3) |
|---|---|---|
| 11 | Backup integrity is validated automatically after every run (kafka-backup validate --deep) | |
| 12 | RPO and RTO targets are defined per topic tier and documented | |
| 13 | Consumer offset recovery has been tested and is part of the restore procedure | |
| 14 | DR drills are conducted at least quarterly with documented results | |
| 15 | Backup storage is geographically separated from the primary Kafka cluster | |
Pillar subtotal: ___ / 15
Performance Efficiency
| # | Check | Score (0–3) |
|---|---|---|
| 16 | Backup throughput has been benchmarked and meets RPO requirements | |
| 17 | Compression algorithm and level have been optimised for your data formats | |
| 18 | kafka-backup is co-located with Kafka brokers (same AZ/region) | |
| 19 | Compute resources are right-sized based on measured utilisation | |
| 20 | Restore performance has been benchmarked and meets RTO requirements | |
Pillar subtotal: ___ / 15
Cost Optimisation
| # | Check | Score (0–3) |
|---|---|---|
| 21 | Storage lifecycle policies are active (tiering from Standard → IA → Glacier) | |
| 22 | Retention policies are defined per topic tier and enforced automatically | |
| 23 | Backup costs are tracked, tagged, and attributed to teams or projects | |
| 24 | VPC endpoints are used for storage access (no public internet transfer costs) | |
| 25 | Compute is right-sized and scales down when not actively backing up | |
Pillar subtotal: ___ / 15
Sustainability
| # | Check | Score (0–3) |
|---|---|---|
| 26 | Compute resources scale down or terminate when not in use | |
| 27 | Topic filtering excludes unnecessary topics from backup | |
| 28 | Cold storage tiers are used for long-term retention | |
| 29 | Compression is enabled to reduce storage and network resource consumption | |
Pillar subtotal: ___ / 12
Total Score
| Pillar | Score |
|---|---|
| Operational Excellence | ___ / 15 |
| Security | ___ / 15 |
| Reliability | ___ / 15 |
| Performance Efficiency | ___ / 15 |
| Cost Optimisation | ___ / 15 |
| Sustainability | ___ / 12 |
| Total | ___ / 87 |
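If you record item scores in a script or spreadsheet export, the pillar subtotals and the total can be computed rather than added by hand. The sketch below assumes scores are held in a dictionary keyed by checklist item number (1 to 29); `PILLARS` and `pillar_subtotals` are hypothetical names, and the item ranges mirror the tables above.

```python
# Checklist item numbers belonging to each pillar, as laid out in the
# tables above. The names here are illustrative only.
PILLARS = {
    "Operational Excellence": range(1, 6),
    "Security": range(6, 11),
    "Reliability": range(11, 16),
    "Performance Efficiency": range(16, 21),
    "Cost Optimisation": range(21, 26),
    "Sustainability": range(26, 30),
}


def pillar_subtotals(scores: dict[int, int]) -> dict[str, int]:
    """Sum item scores per pillar after checking every item is rated 0-3."""
    for item in range(1, 30):
        if scores.get(item) not in (0, 1, 2, 3):
            raise ValueError(f"item {item} needs a score from 0 to 3")
    return {
        pillar: sum(scores[item] for item in items)
        for pillar, items in PILLARS.items()
    }


if __name__ == "__main__":
    # Example: every item rated 2 ("Advanced").
    example = {item: 2 for item in range(1, 30)}
    subtotals = pillar_subtotals(example)
    for pillar, value in subtotals.items():
        print(f"{pillar}: {value}")
    print("Total:", sum(subtotals.values()))
```

Rejecting missing or out-of-range items up front avoids a silently understated total when a row is left blank.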
Next Steps
Based on your score, prioritise improvements in the lowest-scoring pillars:
- Identify the pillar with the lowest score — this is your highest-risk area
- Review the corresponding pillar page for detailed best practices and implementation guidance
- Start with the highest-impact, lowest-effort items: typically monitoring (OE-03, item 3 above), encryption at rest (SEC-02, item 7), and backup validation (REL-01, item 11)
- Set a target score for your next quarterly assessment
- Track progress over time and celebrate improvements
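Because Sustainability is scored out of 12 and the other pillars out of 15, raw subtotals are not directly comparable when picking the lowest-scoring pillar. The sketch below ranks pillars by the fraction of their maximum achieved. This normalisation is an assumption of the example rather than something the checklist prescribes, and `MAX_SCORE` and `rank_pillars` are hypothetical names.

```python
# Maximum achievable subtotal per pillar, from the Total Score table.
MAX_SCORE = {
    "Operational Excellence": 15,
    "Security": 15,
    "Reliability": 15,
    "Performance Efficiency": 15,
    "Cost Optimisation": 15,
    "Sustainability": 12,
}


def rank_pillars(subtotals: dict[str, int]) -> list[str]:
    """Return pillar names ordered from weakest to strongest.

    Pillars are compared by subtotal divided by maximum, so a 12-point
    pillar is not penalised for having fewer items.
    """
    return sorted(subtotals, key=lambda pillar: subtotals[pillar] / MAX_SCORE[pillar])


if __name__ == "__main__":
    example = {
        "Operational Excellence": 12,
        "Security": 9,
        "Reliability": 11,
        "Performance Efficiency": 12,
        "Cost Optimisation": 13,
        "Sustainability": 8,
    }
    # Sustainability has the lowest raw score (8), but Security has the
    # lowest share of its maximum (9 of 15), so it is ranked first.
    print(rank_pillars(example))
```

If you prefer to compare raw subtotals, replace the sort key with `subtotals.get`.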
If your assessment reveals critical gaps, the Reference Architectures provide proven deployment patterns you can adopt. For Enterprise features like encryption, RBAC, and audit logging, contact OSO for a consultation.