Sharded Services: Scaling Data Beyond a Single Machine
Core Concept
- Sharding = splitting data/work across multiple machines (shards)
- Each shard handles only a subset of requests
- A router/load balancer decides which shard handles each request
Sharding vs Replication
| Feature | Replication | Sharding |
|---|---|---|
| Data | Same everywhere | Split across nodes |
| Use case | Stateless services | Stateful / large data |
| Goal | Availability + scaling | Data size + scaling |
Replication = copy
Sharding = divide
Why Sharding Is Needed
1. Scale beyond a single machine
- Data too large → cannot fit in one server
- Sharding distributes storage across nodes
2. Better resource utilization
- Each shard stores unique data
- Avoids duplication seen in replicated caches
Example:
- Replicated cache → low memory efficiency
- Sharded cache → much higher utilization
3. Isolation (bonus use case)
- Protect system from “poison requests”
- Failure impacts only one shard, not entire system
Sharded Cache Insights
1. Improves performance
- Higher cache hit rate
- Faster responses
- Reduces backend load
2. Cache is critical
- If a shard fails:
- Requests mapped to it → cache miss
- Backend load increases
Cache affects:
- Throughput (RPS)
- Latency
3. Warm-up problem
- New shard starts empty
- Needs time to build cache
Deployments can temporarily reduce performance
Replicated Sharding (Best of Both Worlds)
- Each shard itself is replicated
- Combines:
- Sharding (data partitioning)
- Replication (availability)
Benefits:
- No single shard failure
- Better reliability
- Supports scaling per shard
Sharding Function (CRITICAL)
Shard = hash(request) % N
Key properties:
- Deterministic → same request → same shard
- Uniform → even distribution across shards
Choosing the Right Shard Key
Bad choice:
- Entire request → too specific
Better:
- Relevant fields only (e.g., request path)
Best:
- Depends on use case
- Example:
pathonly → efficient cachingcountry + path→ geo-aware caching
Wrong key = bad performance or incorrect results
Consistent Hashing (VERY IMPORTANT)
Problem:
- Changing shards (e.g., 10 → 11) → massive reshuffling
Solution:
- Consistent hashing
- Minimizes data movement
- Only small % of keys remapped
Critical for scaling systems smoothly
Hot Shards Problem
- Some shards get more traffic (uneven load)
- Example:
- Viral content → one shard overloaded
Solution:
- Hot sharding
- Scale that shard independently
- Replicate hot shards
Mental Model
[ Router ]
↓
[ Shard A ] [ Shard B ] [ Shard C ]
(each handles different data)
Trade-offs
Pros
- Scales data beyond single machine
- Efficient resource usage
- Isolation of failures
- Improved performance (via caching)
Cons
- More complex design
- Harder rebalancing
- Requires careful shard key selection
- Cache warm-up issues
One-line Summary
Sharding distributes data across multiple machines so each handles a subset of requests, enabling systems to scale beyond single-node limits—but requires careful design of shard keys and hashing to work efficiently.
#Distributed Systems #System Design #Sharding #Consistent Hashing #Scaling