Kafka Replication Across Data Centers: Architecture Patterns for Multi-DC

July 3, 2026 · 10 min read

The team behind OSO Kafka Backup

Kafka replication across data centers comes down to one architectural decision: run a single stretched cluster spanning your datacenters, or run independent clusters connected by asynchronous replication. Stretched clusters give you zero data loss but demand low-latency links. Separate clusters tolerate WAN conditions but accept a non-zero recovery point. This guide covers both designs, when each wins, and how to build the second one with MirrorMaker 2.

Key takeaway

Stretch a single cluster only when your datacenters are close enough for single-digit-millisecond round trips and you can place brokers in three locations. Everywhere else, run one cluster per datacenter and replicate asynchronously with MirrorMaker 2 — then size the link for peak produce throughput, not the average.

Why replicate Kafka across data centers?

A single-datacenter Kafka cluster has a single failure domain. replication.factor=3 protects you from losing a broker, not from losing the building. Teams add a second datacenter for four reasons:

Disaster recovery. Survive a datacenter failure with a standby cluster that already holds your data.
Latency and locality. Serve consumers from the datacenter nearest to them instead of hauling every read across a WAN.
Data residency. Keep regulated data in-country while replicating only the topics that are allowed to leave.
Migration. Move workloads between datacenters or into the cloud gradually, with both sides live during the transition (see the migration use cases).

Each reason pushes you toward one of two designs, and they behave very differently under failure.

Stretched cluster vs. cross-DC replication

A stretched cluster is one Kafka cluster whose brokers live in multiple datacenters. Partition replicas are spread across sites, and with acks=all plus a correctly set min.insync.replicas, every write is confirmed in more than one datacenter before the producer moves on. Replication is synchronous, so the recovery point objective (RPO) is zero.

Cross-DC replication runs an independent cluster in each datacenter and copies topics between them with a replication tool — MirrorMaker 2, Confluent Replicator, or Cluster Linking. Replication is asynchronous: the source cluster acknowledges writes locally, and the copy follows milliseconds to seconds later. Whatever has not yet crossed the wire when the datacenter fails is your RPO.

	Stretched cluster	Cross-DC replication
Clusters	One, spanning DCs	One per DC
Replication	Synchronous (in-cluster)	Asynchronous (MM2 or similar)
RPO on DC loss	Zero	Replication lag at failure time
Produce latency	Includes inter-DC round trip	Local only
Network partition	Can halt writes cluster-wide	Each cluster keeps working
Offset handling	Native — one cluster	Requires offset translation
Latency requirement	Single-digit ms round trips	Tolerates WAN latency
Typical fit	Metro-area DCs, three sites	Everything else

The latency requirement is the deciding factor in practice. Every produce with acks=all in a stretched cluster waits on the slowest in-sync replica, so inter-DC round-trip time lands directly on your producers. Metro-area datacenters connected by dark fiber can absorb that. Datacenters in different regions cannot.

Quorum placement is the second constraint. With brokers in only two sites, a network partition leaves you choosing between availability and consistency — neither half can safely claim the majority. Production stretched clusters need three locations, even if the third only hosts controller quorum members. If you cannot get three sites with low-latency links, that alone rules the stretched design out.

One consolation for stretched-cluster read traffic: since KIP-392, consumers can fetch from the closest replica by matching client.rack to broker.rack, which keeps read traffic inside a datacenter even though writes cross it.

Cross-DC replication patterns

Once you have chosen separate clusters, the topology question is which direction data flows:

Active-passive. One datacenter serves all traffic; the second receives a continuous copy and waits. Simple to reason about, and failover is a client-repointing exercise. This is the right default.
Active-active. Both datacenters produce and consume, with bidirectional replication between them. You gain locality and put the standby to work, and pay for it with loop prevention, topic-naming discipline, and conflict handling.
Aggregate (fan-in). Several source datacenters replicate into a central cluster for analytics or archival. Common in retail and IoT estates where edge sites produce locally.

Our geo replication guide compares these patterns in depth, including hub-and-spoke and mesh variants and their RTO and cost profiles. The short version: start active-passive, and let a proven requirement — not idle-standby guilt — move you to active-active.

Implementing cross-DC replication with MirrorMaker 2

MirrorMaker 2 (MM2) ships with Apache Kafka and runs on the Connect framework. A working active-passive flow from a dc1 cluster to a dc2 standby:

mm2.properties
clusters = dc1, dc2
dc1.bootstrap.servers = kafka-dc1.internal:9092
dc2.bootstrap.servers = kafka-dc2.internal:9092

# One-way replication: dc1 -> dc2
dc1->dc2.enabled = true
dc1->dc2.topics = orders.*, payments.*
dc1->dc2.topics.exclude = .*\.internal, __.*

# Translate consumer offsets so groups can fail over
emit.checkpoints.enabled = true
sync.group.offsets.enabled = true
sync.group.offsets.interval.seconds = 60

# Copies on the target cluster
dc2.replication.factor = 3

Four things deserve attention before you trust this in production:

Topic naming. MM2's default replication policy prefixes remote topics — orders from dc1 becomes dc1.orders on dc2. That prevents loops in bidirectional setups but means failover consumers must subscribe to the prefixed name. Kafka 3.0+ includes IdentityReplicationPolicy, which keeps names unchanged for one-way, active-passive flows.
Offset translation. Offsets are not comparable between clusters. MM2 checkpoints map each group's committed position from source to target, and sync.group.offsets.enabled applies the mapping automatically when the group is idle on the target. Verify translated offsets in a failover drill before an incident forces the question.
Where MM2 runs. Deploy the Connect workers in the target datacenter. Consuming over the WAN and producing locally degrades more gracefully when the link gets slow, and the standby side keeps operating if the primary datacenter is degraded.
Partition behavior. MM2 preserves partition counts and routes records to the same partition number on the target, so key-based ordering survives replication. It does not replicate every cluster detail — ACLs and quotas need their own process.

During a network partition, MM2 simply falls behind and catches up when the link returns. That is the asynchronous design working as intended: each cluster keeps serving its local clients, and your monitoring — not your producers — absorbs the disruption.

Network architecture for cross-DC Kafka

The replication link is a production dependency and deserves the same design attention as the clusters:

Bandwidth: size for peak, not average. The link must sustain your peak produce rate on replicated topics, with headroom for catch-up after an outage. A link sized to the average will fall behind at the daily peak and never recover before the next one. Enable compression.type=lz4 or zstd on the MM2 producer to cut wire bytes substantially for text-heavy payloads.
Encryption in transit. Cross-DC traffic leaves your building. Terminate TLS on the brokers (security.protocol: SASL_SSL or SSL on the cluster listeners), not just on a VPN wrapper, so the requirement survives network re-architecture.
Dedicated vs. shared links. Replication competing with general WAN traffic produces lag spikes that look like Kafka problems. If a dedicated link is not available, apply QoS so bulk replication cannot starve — or be starved by — other traffic.
Failover DNS. Clients should bootstrap through DNS names you can repoint (kafka-primary.example.com), with TTLs low enough to matter mid-incident. Rehearse the repointing; a failover plan that has never run is a hypothesis.

Monitor end-to-end lag as the timestamp delta between a record's append at the source and its availability at the target — connector task lag alone understates what consumers actually experience after failover.

Replication is not backup

Cross-datacenter replication protects against losing a datacenter. It does not protect against bad data, because it copies everything faithfully: a producer bug, a malformed schema, or an accidental topic deletion reaches the second datacenter in milliseconds. Both copies are now wrong, and neither can answer "restore this topic to yesterday 14:00."

That job needs an immutable, point-in-time copy outside both clusters. OSO Kafka Backup writes records, consumer group offsets, and topic configuration to object storage — S3, Azure Blob, GCS, or filesystem — and restores to millisecond precision on any cluster, including the DR cluster you just failed over to. A minimal backup configuration from the configuration reference:

mode: backup
backup_id: "dc1-orders-daily"

source:
  bootstrap_servers:
    - kafka-dc1.internal:9092

storage:
  backend: s3

Mature multi-DC estates run both layers: replication for availability when a site fails, backups for recoverability when the data itself is the problem.

Choosing your architecture

Work through the decision in this order:

Can you meet stretched-cluster physics? Three sites, single-digit millisecond round trips, and producers that can tolerate inter-DC latency on every acks=all write. If yes and RPO must be zero, stretch.
Otherwise, replicate asynchronously. One cluster per datacenter, MM2 running in the target site, active-passive until a workload proves it needs active-active.
Design the network deliberately. Peak-rate bandwidth, compression, TLS, and DNS failover are architecture, not afterthoughts.
Add the backup layer. Replication answers "what if the datacenter fails?" Backups answer "what if the data is wrong?" Production estates need both answers.

Network failures between datacenters are a certainty on a long enough timeline. The asynchronous design treats them as routine; the stretched design treats them as an event. Choose based on which failure conversation you would rather have.

Frequently asked questions

How do you replicate Kafka across data centers?

Either stretch one cluster across datacenters with rack-aware replica placement and synchronous replication, or run a separate cluster per datacenter and copy topics asynchronously with MirrorMaker 2 or a similar tool. Stretched clusters need low-latency links and three sites; asynchronous replication works over ordinary WAN links.

Can a single Kafka cluster span multiple data centers?

Yes, if the datacenters are close enough. A stretched cluster needs round-trip latency in the low single-digit milliseconds — typically metro-area distance — and brokers or quorum members in at least three locations so a network partition leaves a clear majority.

What is the difference between a stretched Kafka cluster and cross-DC replication?

A stretched cluster is one Kafka cluster whose replicas span datacenters, replicating synchronously with zero RPO but inter-DC latency on every acknowledged write. Cross-DC replication runs independent clusters connected by an asynchronous copier such as MirrorMaker 2, keeping produce latency local but accepting a recovery point equal to replication lag.

What bandwidth is needed for Kafka cross-datacenter replication?

Size the link for the peak produce throughput of the replicated topics, plus headroom to catch up after link outages. A link sized to average throughput falls behind at every daily peak. Compression on the replication producer (lz4 or zstd) reduces wire bytes significantly.

How does MirrorMaker 2 handle network partitions between data centers?

MM2 replication is asynchronous, so a partition causes it to fall behind rather than fail. Each cluster continues serving its local producers and consumers, and MM2 resumes from its committed source offsets and catches up when connectivity returns. Monitor end-to-end lag so you know the real recovery point during the gap.

Why replicate Kafka across data centers?​

Stretched cluster vs. cross-DC replication​

Cross-DC replication patterns​

Implementing cross-DC replication with MirrorMaker 2​

Network architecture for cross-DC Kafka​

Replication is not backup​

Choosing your architecture​