MSK KRaft Migration Precheck Codes
The precheck command performs read-only analysis of both clusters and reports findings at three severity levels. Precheck is free — no license required.
```bash
kafka-backup migrate msk-kraft precheck --config migration.yaml
```
Severity Levels
| Severity | Code prefix | Effect |
|---|---|---|
| Blocker | B## | Migration cannot proceed. Must be resolved first. |
| Warning | W## | Migration can proceed, but review the finding. |
| Info | I## | Informational. No action required. |
Blockers
B02: MSK Serverless Cluster
Message: "{which} MSK Serverless; this migrator only operates on MSK Provisioned clusters"
Cause: One or both cluster ARNs point to MSK Serverless clusters. MSK Serverless is KRaft-only by construction — there is no ZK variant to migrate from.
Fix: Use MSK Provisioned cluster ARNs. The source must be ZK-mode Provisioned, the target must be KRaft-mode Provisioned.
B03: Source Not ZooKeeper Mode
Message: "source cluster metadata mode is {mode}, expected ZOOKEEPER"
Cause: The source cluster is already running in KRaft mode. There is nothing to migrate.
Fix: Verify the source ARN points to a ZooKeeper-mode cluster.
B04: Target Not KRaft Mode
Message: "target cluster metadata mode is {mode}, expected KRAFT"
Cause: The target cluster is not running in KRaft mode.
Fix: Provision the target MSK cluster with KRaft mode enabled (requires Kafka 3.7+).
B05: Source and Target Are the Same Cluster
Message: "source and target ARNs are identical"
Cause: Both source.cluster_arn and target.cluster_arn point to the same cluster. In-place migration is not supported.
Fix: Create a separate KRaft-mode MSK cluster for the target.
B06: Target Kafka Version Too Low
Message: "target Kafka version {version} is below minimum 3.7 required for KRaft on MSK"
Cause: The target cluster is running a Kafka version that does not support KRaft on MSK.
Fix: Upgrade the target MSK cluster to Kafka 3.7.x or later.
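One way to start the upgrade from the CLI, as a sketch; the ARN and versions are placeholders, and the exact target version string should be checked against aws kafka list-kafka-versions:

```bash
# Upgrade the target cluster to a KRaft-capable Kafka version.
aws kafka update-cluster-kafka-version \
  --cluster-arn <target-arn> \
  --current-version <cluster-version> \
  --target-kafka-version 3.7.x
```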
B07: Backup S3 Bucket Not Reachable
Message: "backup S3 bucket '{bucket}' not reachable: {error}"
Cause: The HeadBucket API call failed on the backup S3 bucket. Either the bucket does not exist or the caller lacks permissions.
Fix:
- Create the bucket: aws s3 mb s3://<bucket> --region <region>
- Ensure the migration runner's IAM role has s3:HeadBucket, s3:GetObject, s3:PutObject, and s3:ListBucket on this bucket (see the policy sketch below)
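As a sketch, a policy granting these permissions could look like the following; the bucket name is a placeholder. Note that the HeadBucket API call is authorized by s3:ListBucket, so no separate action is needed for it:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BucketLevel",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::<bucket>"
    },
    {
      "Sid": "ObjectLevel",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::<bucket>/*"
    }
  ]
}
```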
B08: Evidence S3 Bucket Not Reachable
Message: "evidence S3 bucket '{bucket}' not reachable: {error}"
Cause: Same as B07, but for the evidence bucket.
Fix: Same as B07. If using S3 Object Lock, ensure the bucket was created with Object Lock enabled (cannot be added retroactively).
B09: Source Kafka Not Reachable
Message: "source Kafka protocol not reachable: {error}"
Cause: Cannot connect to the source cluster's bootstrap servers or fetch metadata.
Fix:
- Verify bootstrap servers are correct (check the MSK console or aws kafka get-bootstrap-brokers; see the snippet after this list)
- Check security group ingress rules — the migration runner must reach the broker ports
- Verify auth mode matches the cluster's authentication configuration
- For SCRAM: verify the username/password are correct and the SCRAM secret exists in AWS Secrets Manager
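A quick reachability check from the migration runner host, assuming nc is available; port numbers are the MSK defaults per auth mode:

```bash
# Fetch the bootstrap broker string for the source cluster.
aws kafka get-bootstrap-brokers --cluster-arn <source-arn>

# Test TCP reachability to one broker host from the bootstrap string.
# MSK default ports: 9094 = TLS, 9096 = SASL/SCRAM, 9098 = IAM.
nc -zv <broker-host> 9096
```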
B10: Target Kafka Not Reachable
Message: "target Kafka protocol not reachable: {error}"
Cause: Same as B09, but for the target cluster.
Fix: Same as B09.
B11: Target message.max.bytes Too Small
Message: "target broker message.max.bytes={value} is below the largest source topic's effective max.message.bytes={max} (topic '{topic}') — replay would fail with RecordTooLargeException"
Cause: The target cluster's message.max.bytes broker setting is smaller than the largest message size allowed by any source topic. During restore, oversized records would be rejected.
Fix: Raise the target broker's message.max.bytes to at least match the source floor. Update the MSK cluster configuration:
```bash
aws kafka update-cluster-configuration \
  --cluster-arn <target-arn> \
  --configuration-info '{"Arn":"<config-arn>","Revision":<N>}' \
  --current-version <cluster-version>
```
B12: Target replica.fetch.max.bytes Too Small
Message: "target broker replica.fetch.max.bytes={value} is below the largest source topic's effective max.message.bytes={max} (topic '{topic}') — replication would stall on oversized batches"
Cause: The target's inter-broker replication cannot handle the largest messages from source.
Fix: Raise the target broker's replica.fetch.max.bytes alongside message.max.bytes.
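For reference, a configuration revision covering both B11 and B12 might contain the following (10 MiB here is illustrative; substitute your actual source maximum):

```properties
# Both values must be >= the largest effective max.message.bytes on source.
message.max.bytes=10485760
replica.fetch.max.bytes=10485760
```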
B13: Reverse Replication Not Implemented
Message: "cutover.reverse_replication_enabled=true, but reverse replication is not implemented"
Cause: The config enables a feature that is not yet available.
Fix: Set cutover.reverse_replication_enabled: false in your config. Post-cutover rollback to the source cluster is a manual procedure.
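In migration.yaml this looks like the following, assuming the YAML nesting mirrors the dotted key path:

```yaml
cutover:
  reverse_replication_enabled: false
```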
Warnings
W01: Target Has Fewer Brokers
Message: "target has {target} brokers but source has {source} — consider scaling up before seed"
Cause: The target cluster has fewer brokers than the source. Any topic whose replication factor exceeds the target's broker count cannot be placed there.
Action: Consider scaling up the target cluster before migration.
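Broker count can be raised from the CLI; values are placeholders, and note that MSK requires the new broker count to be a multiple of the number of Availability Zones the cluster spans:

```bash
# Scale the target cluster up before the seed phase.
aws kafka update-broker-count \
  --cluster-arn <target-arn> \
  --current-version <cluster-version> \
  --target-number-of-broker-nodes <count>
```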
W02: Cross-Region Migration
Message: "source region {source} ≠ target region {target} — seed + tail will incur egress bandwidth cost"
Cause: Source and target are in different AWS regions. Data transfer between regions incurs egress charges.
Action: Review the cost estimate from plan --format cost. Consider whether the data transfer cost is acceptable.
W03: KMS Key Configured
Message: "KMS key ARN set on backup channel — CMK access is not verified by this precheck phase; ensure the caller has kms:Encrypt/Decrypt/GenerateDataKey"
Cause: A custom KMS key is configured for S3 encryption. Precheck does not verify KMS permissions.
Action: Ensure the migration runner's IAM role has kms:Encrypt, kms:Decrypt, and kms:GenerateDataKey on the specified KMS key ARN.
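A policy statement granting this might look like the following; the key ARN is a placeholder:

```json
{
  "Effect": "Allow",
  "Action": ["kms:Encrypt", "kms:Decrypt", "kms:GenerateDataKey"],
  "Resource": "arn:aws:kms:<region>:<account-id>:key/<key-id>"
}
```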
W04: Message Size Check Skipped
Message: "could not verify target message-size floor ({reason}) — ensure target message.max.bytes and replica.fetch.max.bytes ≥ largest source topic's effective max.message.bytes"
Cause: The DescribeConfigs API call failed for source or target brokers, but the brokers are reachable. This is a fail-open scenario.
Action: Manually verify that the target's message.max.bytes and replica.fetch.max.bytes are sufficient.
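One way to check manually with the stock Kafka CLI, as a sketch; client.properties must match the target's auth mode:

```bash
# Describe effective broker configs on the target and filter the two floors.
kafka-configs.sh --bootstrap-server <target-bootstrap> \
  --command-config client.properties \
  --entity-type brokers --entity-default --describe --all \
  | grep -E 'message\.max\.bytes|replica\.fetch\.max\.bytes'
```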
W05: Static Consumer Group Members
Message: "source cluster has static consumer-group members ({summary}). Post-cutover, these consumers MUST restart against the target with the same group.instance.id values..."
Cause: Some consumer groups use static membership (group.instance.id). These consumers must reconnect to the target with identical instance IDs to avoid a full group rebalance.
Action: Ensure application deployments preserve group.instance.id values when switching to the target cluster.
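For example, a consumer instance that ran against the source with a static ID must rejoin the target with the same value (group and instance names here are hypothetical):

```properties
# Consumer configuration; group.instance.id must match what was used on source.
bootstrap.servers=<target-bootstrap-servers>
group.id=orders-processor
group.instance.id=orders-processor-0
```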
W06: Transactional Producers Detected
Message: "source cluster has {total} transactional producer(s)..."
Cause: Transactional state (producer ID + epoch) does not migrate. Exactly-once guarantees do not span the cutover boundary.
Action: Applications using transactions must call initTransactions() after reconnecting to the target (a sketch follows). Drain active transactions on the source before initiating cutover.
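A minimal Java sketch of the reconnect sequence; the topic, record, and transactional.id are hypothetical:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TargetReconnect {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "<target-bootstrap-servers>");
        // Keep the same transactional.id the application used on source.
        props.put("transactional.id", "orders-writer-1");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Acquires a fresh producer ID/epoch on the target and fences any
            // zombie instance still holding the old one.
            producer.initTransactions();
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("orders", "key", "value"));
            producer.commitTransaction();
        }
    }
}
```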
W07: Log-Compacted Topics
Message: "{count} source topic(s) use cleanup.policy=compact..."
Cause: Compacted topics may have records deleted between seed and tail phases. The validation suite treats empty fetches and drift on compacted topics as warnings instead of failures.
Action: If bit-for-bit parity is required for compacted topics, run your own diff after finalize.
W08: SCRAM Target Needs Pre-Provisioned Users
Message: "target uses SCRAM-SHA-512 — SCRAM user credentials cannot be read via the Kafka protocol..."
Cause: The target cluster uses SCRAM authentication. SCRAM user credentials (stored in AWS Secrets Manager) cannot be read or copied programmatically. If the same users don't exist on the target, copied ACLs will reference unauthenticated principals.
Action: Pre-provision all SCRAM users on the target cluster before cutover. Use aws kafka batch-associate-scram-secret to associate the same Secrets Manager secrets.
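For example (the secret ARN is a placeholder; MSK requires SCRAM secrets to be named with the AmazonMSK_ prefix and encrypted with a customer-managed KMS key):

```bash
# Associate the same Secrets Manager secrets with the target cluster.
aws kafka batch-associate-scram-secret \
  --cluster-arn <target-arn> \
  --secret-arn-list arn:aws:secretsmanager:<region>:<account-id>:secret:AmazonMSK_<user>
```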
W09: MSK Internal ACLs Will Be Filtered
Message: "{count} source ACL binding(s) reference MSK/Kafka internal principals or resources and will be filtered during ACL copy"
Cause: Some ACL bindings on the source reference internal principals (e.g., User:ANONYMOUS) or internal resources (__consumer_offsets). These are managed by MSK automatically and should not be copied.
Action: No action needed. The filtered bindings are logged for transparency.
W10: Finite Delete-Retention Topics
Message: "{count} source topic(s) use finite delete retention ({topic}({retention_ms}ms)...). SEED restores original CreateTime timestamps, so the target broker may advance log-start before cutover if old restored records become retention-eligible. Temporarily extend topic retention for the migration window, or rely on the cutover offset-floor guard to block the client switch if truncation occurs."
Cause: One or more source topics use cleanup.policy=delete with finite retention.ms. Kafka retention uses the record timestamp when message.timestamp.type=CreateTime, so restored historical records may become immediately retention-eligible on the target.
Action: Temporarily extend retention for affected topics during the migration window, or set retention to -1 until finalize completes. Keep the target offset-floor guard enabled; it verifies target log-start offsets before READY_FOR_CLIENT_SWITCH and blocks the switch if the target has already truncated copied data.
Example topic configuration that triggers this warning:
```properties
cleanup.policy=delete
message.timestamp.type=CreateTime
retention.ms=604800000
segment.ms=604800000
```
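One way to extend retention for the migration window with the stock Kafka CLI, applied on the target where truncation is the risk; revert after finalize:

```bash
# Disable time-based retention on the affected topic until finalize completes.
kafka-configs.sh --bootstrap-server <target-bootstrap> \
  --command-config client.properties \
  --entity-type topics --entity-name <topic> \
  --alter --add-config retention.ms=-1
```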
Info
I01: IAM Target — ACLs Emitted as Access Map
Message: "target is IAM-auth — ACLs will be emitted as access-map.json for customer IaC to translate to IAM policies (tool does not apply IAM)"
Cause: The target uses IAM authentication. Kafka ACLs don't apply on IAM-auth clusters. Instead, the tool generates an access-map.json that maps source ACL principals and permissions to the IAM policies you need to create.
Action: After migration, apply the generated IAM policies using your infrastructure-as-code tooling (Terraform, CloudFormation, CDK).
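As an illustration only (the access-map.json schema is defined by the tool, and the principal and topic names here are hypothetical), a source ACL granting User:app-reader READ on topic orders might translate to an IAM policy like:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kafka-cluster:Connect",
        "kafka-cluster:DescribeTopic",
        "kafka-cluster:ReadData"
      ],
      "Resource": [
        "arn:aws:kafka:<region>:<account-id>:cluster/<cluster-name>/<cluster-uuid>",
        "arn:aws:kafka:<region>:<account-id>:topic/<cluster-name>/<cluster-uuid>/orders"
      ]
    }
  ]
}
```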
Next Steps
- Production Migration Runbook — step-by-step guide
- Configuration Reference — tune precheck-related settings
- Troubleshooting — post-precheck error resolution