System Design Mindset
This page teaches the way of thinking you should bring to any High-Level Design (HLD) question — whether it’s an interview or a real-world architecture discussion.
Good system design is mostly about clarity, structure, and trade-offs. Start with what the system must do, then decide how well it must do it, then design incrementally with clear checkpoints.
1. Quick summary (the short elevator version)
- Clarify requirements — ask about functional and non-functional requirements (NFRs).
- Identify constraints & assumptions — resources, deadlines, legacy tech.
- Estimate workload — QPS, throughput, storage, latency targets.
- Sketch a simple design (monolith) that satisfies requirements.
- Find bottlenecks and iterate: caching, replication, partitioning, async.
- Call out trade-offs at every step and show monitoring & failure plans.
2. Functional vs Non-Functional Requirements
Functional requirements (the what): concrete features and behaviors.
- Example items: create user, upload image, send message, redirect short URL.
Non-functional requirements (NFRs) (the how well): quality attributes and constraints.
- Scalability (e.g., must handle 100k QPS).
- Latency (e.g., tail latency < 200 ms).
- Availability (e.g., 99.95% uptime).
- Consistency (strong vs eventual).
- Durability, security, cost, maintainability, compliance.
Interview habit: always repeat requirements back and ask which NFRs are critical. For example: “Do you want strict consistency for this API, or is eventual consistency acceptable?”
3. Clarify constraints & assumptions (fast)
Ask explicit questions before designing. Typical constraints:
- Time: do you have 15 minutes to design or 45 minutes?
- Budget: can we use managed cloud services, or must the design be cost-sensitive?
- Existing systems: any legacy DBs or languages?
- Geographic: single region or global users?
- Team: how many engineers for maintenance?
State assumptions out loud if the interviewer doesn’t specify them:
“I’ll assume 1M daily active users, ~10M requests/day, and low-latency reads are most important. If that’s wrong, tell me.”
4. Find bottlenecks — what to look for
Bottlenecks depend on workload. Common hotspots:
- CPU — expensive compute (image processing, ML inference).
- Memory — large in-memory caches, per-connection state.
- Disk / IOPS — DB writes, backups, compaction.
- Network — bandwidth-heavy workloads (video).
- DB connections — connection limits & locks.
- Single Point of Failure — single DB, single load balancer, etc.
How to present this: show a quick table linking requirements → likely bottlenecks. Example: “If the system is write-heavy, DB write throughput is the likely bottleneck.”
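A few more illustrative (not exhaustive) pairings:
- Read-heavy, latency-sensitive → cache hit rate and DB read IOPS.
- Large media upload/download → network bandwidth and storage cost.
- Fan-out features (feeds, notifications) → queue depth and worker throughput.
- Global user base → cross-region latency and replication lag.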
5. Step-by-step scaling roadmap (the interview-friendly flow)
- Start simple — single app server + single DB (monolith).
- Optimize vertically — bigger machine, better disks (short-term).
- Add cache & read replicas — reduce DB load for reads (a read/write-splitting sketch follows at the end of this section).
- Partition (shard) and distribute — split data when a single DB can no longer hold the dataset or keep up with writes.
- Move to microservices only when domains require independent scaling.
- Geo-distribute with replication & edge caches for global users.
- Automate & observe — add autoscaling, monitoring, distributed tracing.
Always justify each move: don’t jump to microservices or sharding without explaining why.
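To make step 3 concrete, here is a minimal sketch of read/write splitting, assuming one primary and a list of read replicas; the connection objects and their `execute` method are placeholders, not a specific driver API.

```python
import random

class ReadWriteSplitRouter:
    """Send writes to the primary and spread reads across replicas (illustrative only)."""

    def __init__(self, primary, replicas):
        self.primary = primary      # the single writable node (source of truth)
        self.replicas = replicas    # read-only copies kept up to date by replication

    def write(self, sql, params=()):
        # Writes always go to the primary so there is one source of truth.
        return self.primary.execute(sql, params)

    def read(self, sql, params=()):
        # Reads are load-balanced across replicas; fall back to the primary if none exist.
        node = random.choice(self.replicas) if self.replicas else self.primary
        return node.execute(sql, params)
```

Replica reads can be slightly stale because of replication lag; that is the same consistency trade-off called out in the decision cues below.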
6. Decision cues — when to do what
- Use caching when reads >> writes and latency is critical (see the cache-aside sketch after this list).
- Use read replicas when read scaling is needed but writes are limited.
- Shard when dataset size or write throughput exceeds single-node capacity.
- Use specialized DBs when problem fits (graph for relationships, time-series for metrics).
- Use asynchronous processing for non-critical or long-running tasks (image processing, emails).
- Prefer eventual consistency in user-facing features where slight staleness is acceptable (feeds), but require strong consistency for money transfers.
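The first cue (reads >> writes, latency-critical) is usually implemented as cache-aside. A minimal sketch, assuming a plain dict as an in-process stand-in for Redis/Memcached and a caller-supplied `db_lookup` function:

```python
import time

CACHE_TTL_SECONDS = 300   # assumed TTL; bounds how stale a cached value can get
_cache = {}               # stand-in for Redis/Memcached: key -> (value, expires_at)

def get_with_cache(key, db_lookup):
    """Cache-aside read: serve from cache if fresh, otherwise read the DB and repopulate."""
    entry = _cache.get(key)
    if entry is not None and entry[1] > time.time():
        return entry[0]                                   # cache hit: no DB round trip
    value = db_lookup(key)                                # cache miss: go to the source of truth
    _cache[key] = (value, time.time() + CACHE_TTL_SECONDS)
    return value
```

The same shape applies with a real cache server; the trade-off is a staleness window bounded by the TTL, plus one more component to operate and monitor.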
7. A small worked example — URL Shortener (brief)
Clarify requirements
- Functional: create short URL, redirect, track analytics.
- NFRs: low latency on redirects (<50ms), high availability, 100M short links stored.
Capacity sketch (example arithmetic shown step-by-step)
Suppose: 1,000,000 daily active users, each performs 10 redirects/day → total requests/day = 1,000,000 × 10 = 10,000,000 requests/day.
Convert to requests/sec:
- Seconds/day = 24 × 3600 = 86,400.
- Requests/sec = 10,000,000 ÷ 86,400 ≈ 115.74.
- So average QPS ≈ 116 req/s. Plan for peak: use a 10× factor → ~1,160 req/s.
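The same arithmetic as a tiny script, handy for double-checking estimates while practicing (all numbers are the assumptions stated above):

```python
DAU = 1_000_000            # assumed daily active users
REDIRECTS_PER_USER = 10    # assumed redirects per user per day
PEAK_FACTOR = 10           # rough multiplier for traffic spikes

requests_per_day = DAU * REDIRECTS_PER_USER      # 10,000,000
seconds_per_day = 24 * 3600                      # 86,400
avg_qps = requests_per_day / seconds_per_day     # ≈ 115.74
peak_qps = avg_qps * PEAK_FACTOR                 # ≈ 1,157, i.e. ~1,160 rounded

print(f"average ≈ {avg_qps:.0f} req/s, planned peak ≈ {peak_qps:.0f} req/s")
```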
From here:
- One small web server can serve thousands of cached redirects per second; use a CDN to cache hot redirect responses.
- Data size: 100M short-links × 100 bytes/link ≈ 10,000,000,000 bytes ≈ 10 GB (double-check assumptions for metadata).
Simple architecture
- A web layer with CDN for redirects.
- Key-value store (Redis/DynamoDB) for the short→long mapping (fast reads).
- A write path that writes to the durable DB and populates the cache (sketched below).
- Async pipeline for analytics.
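A minimal sketch of the write and read paths, using plain dicts and an in-process queue as stand-ins for the real key-value store, durable DB, and message broker:

```python
import json
import queue

analytics_queue = queue.Queue()   # stand-in for Kafka/SQS: keeps analytics off the hot path

def create_short_url(code, long_url, durable_db, cache):
    """Write path: persist to the durable store first, then warm the cache."""
    durable_db[code] = long_url        # source of truth (e.g. a DynamoDB table)
    cache[code] = long_url             # so the first redirect is already a cache hit
    return code

def redirect(code, durable_db, cache):
    """Read path: cache first, fall back to the durable store, record analytics asynchronously."""
    long_url = cache.get(code)
    if long_url is None:
        long_url = durable_db.get(code)      # cache miss: read the source of truth
        if long_url is None:
            return None                      # unknown code: the web layer returns 404
        cache[code] = long_url               # repopulate for the next request
    analytics_queue.put(json.dumps({"code": code}))   # fire-and-forget; the async pipeline consumes it
    return long_url                          # the web layer turns this into a 301/302
```

The analytics write never blocks the redirect, which is what keeps the low-latency target realistic.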
Trade-offs
- Storing canonical mapping in a durable DB vs caching for latency.
- Collision handling (hash length decisions; illustrated below).
- Analytics can be eventually consistent.
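To make the collision-handling trade-off concrete, two common approaches are base62-encoding a unique numeric ID (collision-free by construction, but codes reveal creation order) and hashing the long URL with a retry on collision. A minimal sketch of both, with a plain dict standing in for the store:

```python
import hashlib

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def base62(n):
    """Encode a unique numeric ID (e.g. from an ID-generation service); never collides."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, r = divmod(n, 62)
        out.append(ALPHABET[r])
    return "".join(reversed(out))

def hash_code(long_url, store, length=7):
    """Hash the URL and truncate; shorter codes need an explicit collision check."""
    digest = hashlib.sha256(long_url.encode()).hexdigest()
    for start in range(0, len(digest) - length):
        code = digest[start:start + length]
        if store.get(code) in (None, long_url):   # free slot, or the same URL already stored
            return code
    raise RuntimeError("exhausted candidate codes; increase length")
```

Longer codes make collisions rarer but make the URLs less convenient; that is the hash-length decision above.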
8. Design process checklist (step-by-step to use in interviews)
- Clarify functional & non-functional requirements (repeat them).
- State constraints & assumptions (and ask clarifying questions).
- Estimate load (QPS, storage, bandwidth). Show any calculations briefly.
- Draw a minimal architecture that satisfies requirements.
- Identify bottlenecks in that design.
- Propose targeted optimizations (cache, replicas, partitions).
- Discuss trade-offs (cost, complexity, consistency).
- Add observability & failure plan (monitoring, retries, graceful degradation).
- Summarize — repeat final architecture and why it meets the key NFRs.
9. Interview tips — what interviewers want to hear
- Ask clarifying questions before designing.
- Speak in a structured way: requirements → constraints → architecture → scaling plan.
- Show numbers (even rough estimations) — interviewers like capacity reasoning.
- Call out trade-offs explicitly. Say which requirement you’re prioritizing (latency vs durability vs cost).
- Start simple and iterate — show incremental improvements as load grows.
- Mention observability & testing — how you’ll detect and recover from failures.
- Use real-world analogies and case studies where relevant.
10. Quick checklist to keep on hand
- Have I asked about functional & non-functional requirements? ✅
- Have I stated assumptions and constraints? ✅
- Did I estimate QPS/throughput and storage? ✅
- Is my initial design simple and end-to-end? ✅
- Did I point out the major bottlenecks and fixes? ✅
- Did I call out trade-offs explicitly? ✅
- Did I add monitoring and failure handling? ✅
11. Next steps / links
- Continue to Workload Estimation for worked examples of capacity calculations.
- See practical patterns in Caching strategies and Sharding & Replication.