Sharding vs Replication vs Partitioning
When databases grow beyond a single machine, three major scaling strategies come into play:
- Replication – copy the same data to multiple machines
- Partitioning – divide data logically or physically into segments
- Sharding – a type of partitioning across multiple machines
These concepts overlap, but they solve different problems. Understanding the distinction is critical for system design and interviews.
1. Replication
What is Replication?
Replication = keeping multiple copies of the same data on different nodes.
Types
Primary–Secondary (Master–Slave)
- Writes → Primary
- Reads → Replicas
- Example: MySQL read replicas
Multi-Primary (Multi-Master)
- Multiple nodes accept writes
- Requires conflict resolution (LWW, version vectors)
- Example: MongoDB Replica Sets, PostgreSQL BDR
Benefits
- High availability — failover to replicas
- Read scalability — distribute reads
- Disaster recovery — backups from replicas
- Geo-distribution — replicas close to users
Downsides
- Replication lag on replicas
- Conflict resolution issues (multi-master)
- Higher storage cost
2. Partitioning (General Concept)
What is Partitioning?
Partitioning = dividing a dataset into smaller logical or physical parts.
IMPORTANT:
Partitioning is a broad concept. Sharding is one specific type of partitioning across nodes.
Types of Partitioning
Vertical Partitioning (by columns)
- Table split into groups of columns
- Example: UserProfile → {user_basic_info}, {user_settings},
Horizontal Partitioning (by rows)
- Table divided into row groups
- Example: Orders partitioned by year
Functional Partitioning
- Split by service/function
- Example: Users DB, Orders DB, Analytics DB
Benefits
- Better data organization
- Faster queries on partitions
- Easier to archive old data
Downsides
- Still sits on one machine unless combined with sharding
- Application logic may need partition awareness
3. Sharding (Distributed Partitioning)
What is Sharding?
Sharding = horizontal partitioning across multiple machines.
Each shard contains a portion of data:
- Shard 1 → Users A–M
- Shard 2 → Users N–Z
Sharding Strategies
- Range-based — sequential ranges
- Hash-based — hash(key) → shard
- Directory-based — external mapping service
Benefits
- Write scalability — writes spread across shards
- Storage scalability — dataset no longer limited to one machine
- Parallelism — multiple shards process queries
Downsides
- Distributed joins are expensive
- Hotspots when one shard gets more load
- Shard rebalancing is non-trivial
- Higher application complexity (choosing shard key)
4. Combined Comparison
| Feature | Replication | Partitioning (General) | Sharding |
|---|---|---|---|
| Definition | Duplicate full dataset on multiple nodes | Split dataset logically/physically | Horizontal partitioning across machines |
| Improves | Availability, read scaling | Manageability, performance | Write scalability, storage capacity |
| Data on Each Node | Full copy | Portion (depends on type) | Portion (horizontal rows) |
| Failure Tolerance | High | Medium (depends on replication) | Low per shard (unless replicated) |
| Primary Use Case | Read-heavy load, HA | Large tables, organization | Write-heavy workloads, huge datasets |
| Routing Complexity | Low (any replica for reads) | Medium | High (must find the correct shard) |
5. When to Use What
Use Replication when:
- Read-heavy workloads
- High availability is needed
- Dataset fits on single machine
Use Partitioning when:
- Table gets too large
- Archival or performance optimization is required
- You want clean data organization
Use Sharding when:
- Dataset is too large for one server
- Writes overwhelm a single node
- You need parallelization across machines
6. Sharding + Replication + Partitioning (Real Systems)
Modern systems combine all three:
Example Architecture
- Partition data logically (e.g., user data, transactions, logs)
- Shard each partition across multiple hosts
- Replicate each shard for availability
Used by:
- MongoDB Sharded Clusters (sharding + replication)
- Cassandra (partitioning + replication)
- Google Spanner (sharding + synchronous replication)
- Instagram (PostgreSQL sharded + replicas)
7. Interview Tips
- Start by asking: Is the system read-heavy or write-heavy?
- Tie answers to scalability goals
- Mention shard-key selection
- Talk about hotspots and rebalancing
- Mention CAP/PACELC
👉 Sample Interview Answer:
"If the workload is read-heavy, I use replication to scale reads and improve availability. If writes or data size exceed a single machine, I use sharding. Partitioning helps organize and manage large tables. In real systems, I combine all three: partition the schema, shard horizontally across nodes, and replicate each shard for high availability."
8. Recap
- Replication = multiple copies → improves availability and read scaling.
- Partitioning = divide data → improves manageability and performance.
- Sharding = distributed partitioning → improves write scaling and storage.
- Real systems use all three for resilience and massive scalability.