
Wednesday, December 31, 2025

Core System Design Concepts

This post covers the 20 most important concepts for designing large-scale software systems. The first summary below is organized into six sections: Scaling, Networking, Caching & Communication, API Patterns, Databases, and Message Queues.

1. Scaling: Handling More Users

When an app gets too many users for one computer to handle, you have two choices:

  • Vertical Scaling: Buying a bigger, faster computer (more RAM/CPU). It's easy but has a limit.

  • Horizontal Scaling: Adding more of the same-sized computers. This usually scales much further than a single machine can, and if one computer breaks, the others keep working (redundancy).

  • Load Balancers: A "traffic cop" server that sits in front of your computers and makes sure work is spread out evenly so no single server gets overwhelmed.

  • Content Delivery Networks (CDN): A global network of servers that store copies of your files (images/videos) close to where the user lives, making the app feel much faster.

2. Networking: How Computers Talk

  • IP Address: The unique "digital home address" for every device on the internet.

  • TCP/IP: The rules for sending mail on the internet. It breaks files into small "packets," numbers them, and ensures they are put back together correctly at the end.

  • DNS (Domain Name System): The "phonebook" of the internet. It translates a name like google.com into the IP address the computer needs.

3. Caching & Communication

  • Caching: Saving a copy of data in a "faster" spot so you don't have to fetch it from the original source again. (e.g., keeping your keys in your pocket rather than walking back to the bedroom to find them).

  • HTTP: The specific language web browsers use to talk to servers. It uses a "shipping label" (Header) and the "package contents" (Body).

  • WebSockets: Unlike regular web requests that ask for data and hang up, WebSockets keep the line open. This is used for things like Chat Apps where messages need to pop up instantly.

4. API Patterns (The Rules for Data)

  • REST: The most common API style. It relies on standard HTTP methods and status codes, which makes it simple and predictable (e.g., status 404 means "Not Found").

  • GraphQL: Instead of the server deciding what data to give you, the client asks for exactly the fields it needs. This prevents "over-fetching" useless data.

  • gRPC: A high-speed system used mainly for servers talking to other servers. It uses "binary" (shorthand) instead of text to move data faster.

5. Databases: Storing Data

  • SQL (Relational): Data is organized into tables of rows and columns. Relational databases typically follow ACID rules, which guarantee that transactions (like bank transfers) are "all-or-nothing" and leave the data in a valid state.

  • NoSQL (Non-Relational): These drop the strict "neat table" rules to make it much easier to scale horizontally across thousands of machines.

  • Sharding: Breaking one giant database into smaller pieces (shards) and spreading them across different computers.

  • Replication: Keeping identical copies of your database in different parts of the world so data isn't lost if a building loses power.

6. Message Queues

Think of these as a "to-do list" for your servers. If your system is getting more work than it can handle right now, it puts the tasks in a Message Queue so it can finish them one by one at its own pace without crashing.

Key Takeaway: System design is essentially the art of finding the best way to move, store, and protect data while making sure the app stays fast as it grows.


Below is a plain-language summary of the same concepts, focusing on the big ideas without heavy jargon.


Scaling Applications

Vertical Scaling

  • Add more power (CPU, RAM) to a single server

  • Easy to do, but has limits

  • Still a single point of failure

Horizontal Scaling

  • Add more servers (replicas)

  • Requests are split across servers

  • Much more scalable and fault-tolerant

Real-world example:
Instead of hiring one super-strong worker, hire multiple average workers.


Load Balancers

  • A load balancer sits in front of servers

  • It distributes incoming requests evenly

  • Prevents one server from getting overloaded

  • Can route users to the nearest server

Example: Traffic police directing cars to different lanes.


Content Delivery Networks (CDNs)

  • Used for static content like images, videos, CSS, JS

  • Copies content to servers around the world

  • Users get data from the closest server

Example: Watching Netflix from a nearby server instead of one far away.


Caching (Making Things Faster)

  • Stores frequently used data closer to the user

  • Reduces repeated network calls

  • Exists at many levels:

    • Browser cache

    • Memory cache

    • CPU cache

Example: Keeping frequently used files on your desk instead of in storage.
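
To make the idea concrete, here is a minimal Python sketch of an in-memory cache that keeps only the most recently used items (an LRU cache). The names LRUCache, fetch_from_source, and read are hypothetical, and fetch_from_source simply stands in for a slow database or network call.

```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` items, evicting the least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


slow_calls = []  # records every trip to the slow source


def fetch_from_source(key):
    # Hypothetical stand-in for a database query or network request.
    slow_calls.append(key)
    return f"value-for-{key}"


cache = LRUCache(capacity=2)


def read(key):
    value = cache.get(key)
    if value is None:  # cache miss: go to the source, then remember it
        value = fetch_from_source(key)
        cache.put(key, value)
    return value


for key in ["a", "a", "b", "c", "a"]:
    read(key)
```

The second read of "a" is served from the cache. The last read of "a" goes back to the source because "a" was evicted when "c" arrived, which shows the trade-off between cache size and hit rate.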


Networking Basics

IP Address

  • Every device has a unique identifier on the internet

TCP/IP

  • Rules for sending data reliably

  • Breaks data into packets

  • Resends missing packets

Example: Sending a book page-by-page with page numbers.
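
The page-number analogy can be sketched in a few lines of Python. This is a toy model, not how TCP is actually implemented (real TCP numbers bytes, uses acknowledgements, and runs in the operating system); the function names here are hypothetical.

```python
import random


def to_packets(data, size):
    """Split data into (sequence_number, chunk) pairs."""
    offsets = range(0, len(data), size)
    return [(seq, data[i:i + size]) for seq, i in enumerate(offsets)]


def missing_sequence_numbers(packets, total):
    """Work out which packets never arrived."""
    received = {seq for seq, _ in packets}
    return [seq for seq in range(total) if seq not in received]


def reassemble(packets):
    """Put chunks back in order using their sequence numbers."""
    return b"".join(chunk for _, chunk in sorted(packets))


message = b"Sending a book page-by-page with page numbers."
packets = to_packets(message, size=8)
total = len(packets)

# Simulate the network: packets arrive out of order and one is lost.
in_transit = packets.copy()
random.Random(0).shuffle(in_transit)
lost = in_transit.pop()

# The receiver notices the gap and the sender resends that packet.
to_resend = missing_sequence_numbers(in_transit, total)
in_transit.extend(p for p in packets if p[0] in to_resend)

received = reassemble(in_transit)
```

Because every chunk carries its number, the receiver can detect the missing one and rebuild the original message even though the chunks arrived shuffled.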


Domain Name System (DNS)

  • Converts website names (neetcode.io) into IP addresses

  • Cached so it doesn’t need to be looked up every time

Example: Phone contact name → phone number.
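
As a rough sketch of why DNS caching helps, the Python below looks a name up once and then reuses the answer until it expires. CachingResolver is a hypothetical class, the record table is a plain dictionary instead of a real DNS server, and 192.0.2.10 is an address from a range reserved for documentation.

```python
class CachingResolver:
    """Resolves names from a record table and caches answers for a TTL."""

    def __init__(self, records, ttl_seconds, clock):
        self._records = records      # stand-in for the real DNS hierarchy
        self._ttl = ttl_seconds
        self._clock = clock          # injected so time can be faked
        self._cache = {}             # name -> (address, expires_at)
        self.lookups = 0             # number of "real" lookups performed

    def resolve(self, name):
        now = self._clock()
        cached = self._cache.get(name)
        if cached and cached[1] > now:
            return cached[0]         # still fresh: no lookup needed
        self.lookups += 1
        address = self._records[name]
        self._cache[name] = (address, now + self._ttl)
        return address


fake_time = [0.0]
resolver = CachingResolver(
    records={"example.com": "192.0.2.10"},
    ttl_seconds=60,
    clock=lambda: fake_time[0],
)

first = resolver.resolve("example.com")    # real lookup
second = resolver.resolve("example.com")   # served from cache
lookups_before_expiry = resolver.lookups

fake_time[0] = 61.0                        # the cached entry has expired
third = resolver.resolve("example.com")    # real lookup again
```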


HTTP (How the Web Works)

  • Built on top of TCP

  • Uses requests and responses

  • Each has:

    • Headers (metadata)

    • Body (actual data)

Example: Mailing a package with a shipping label and contents.
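
To show what the "shipping label" and "contents" look like on the wire, here is a small Python sketch that builds and parses an HTTP/1.1 request as text. The helper names, the /users path, and the example.com host are assumptions for illustration; real applications would use an HTTP library instead of hand-building requests.

```python
def build_request(method, path, headers, body=""):
    """Assemble request line + headers, a blank line, then the body."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_request(raw):
    """Split a raw request back into method, path, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, path, _version = request_line.split(" ")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, path, headers, body


body = '{"name": "Ada"}'
raw = build_request(
    "POST",
    "/users",
    {
        "Host": "example.com",
        "Content-Type": "application/json",
        "Content-Length": str(len(body.encode("utf-8"))),
    },
    body,
)

method, path, headers, parsed_body = parse_request(raw)
```

The headers describe the package (who it is for, what format it is in, how large it is), and the blank line marks where the body begins.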


API Design Patterns

REST

  • Most common API style

  • Uses HTTP methods and status codes

  • Stateless and simple

GraphQL

  • Fetch exactly the data you need

  • Multiple resources in one request

  • Avoids over-fetching

gRPC

  • Faster, binary-based communication

  • Mostly used between servers

  • Less human-readable than REST

WebSockets

  • Real-time, two-way communication

  • Used in chat apps, live updates

Example:

  • REST = ordering one item at a time

  • GraphQL = ordering a custom meal in one go

  • WebSockets = live phone call instead of letters


Databases

SQL (Relational Databases)

  • Structured tables (rows and columns)

  • Fast queries

  • ACID properties:

    • Atomicity

    • Consistency

    • Isolation

    • Durability

Best for financial and transactional data.


NoSQL Databases

  • More flexible structure

  • Easier to scale

  • Often relaxes strict consistency rules

Best for large-scale, distributed systems.


Sharding & Replication

Sharding

  • Split data across multiple databases

  • Each server stores a portion of data

Replication

  • Create copies of data

  • Improves read performance

  • Types:

    • Leader–Follower

    • Leader–Leader

Example:
Sharding = splitting a book into chapters
Replication = making photocopies


CAP Theorem

In distributed systems, you can only fully guarantee two out of three:

  • Consistency

  • Availability

  • Partition tolerance

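Trade-offs are unavoidable. In practice, network partitions cannot be ruled out, so the real choice during a partition is between consistency and availability.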


Message Queues

  • Store messages temporarily

  • Handle traffic spikes

  • Decouple system components

Example:
Order queue in a restaurant when the kitchen is busy.


Final Takeaway

System design is about efficiently storing, moving, and scaling data while handling failures gracefully.

Mastering these concepts helps you:

  • Build scalable systems

  • Avoid bottlenecks

  • Clear system design interviews


Visual Takeaway
Users
 ↓
Load Balancer
 ↓
Servers
 ↓
Cache
 ↓
Database
 ↓
Message Queue


System Design Interview Questions & Answers


Vertical vs Horizontal Scaling

Q: What is the difference between vertical and horizontal scaling?

A:

  • Vertical scaling increases the resources (CPU, RAM) of a single server. It’s easy but limited and creates a single point of failure.

  • Horizontal scaling adds more servers and distributes traffic among them. It’s more scalable, fault-tolerant, and preferred for large systems.

Interviewer looks for: Trade-offs and scalability limits.


Why Is Horizontal Scaling Preferred?

Q: Why do most large systems prefer horizontal scaling?

A:
Because it allows near-infinite growth, improves fault tolerance, and avoids hardware limits. If one server fails, others continue serving requests.

Interviewer looks for: Reliability + scalability reasoning.


What Is a Load Balancer?

Q: What does a load balancer do?

A:
A load balancer distributes incoming traffic across multiple servers to prevent overload and improve availability. It can also route users to the nearest server.

Common algorithms: Round-robin, least connections, hashing.
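
As a rough illustration of the first two algorithms, here is a minimal Python sketch. The class names and server labels are hypothetical, and a real load balancer would also handle health checks, timeouts, and concurrency.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._rotation = cycle(servers)

    def pick(self):
        return next(self._rotation)


class LeastConnectionsBalancer:
    """Picks the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when a request finishes.
        self.active[server] -= 1


servers = ["server-a", "server-b", "server-c"]  # hypothetical names

rr = RoundRobinBalancer(servers)
rr_picks = [rr.pick() for _ in range(4)]

lc = LeastConnectionsBalancer(servers)
first = lc.pick()    # all idle, so the first server wins the tie
second = lc.pick()   # the first server is busy, so the next one is chosen
lc.release(first)    # the first request finishes
third = lc.pick()    # the freed server is the least loaded again
```

Round-robin ignores how busy each server is, which is fine when requests cost about the same. Least connections adapts better when some requests take much longer than others.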


What Problem Does a CDN Solve?

Q: Why do we use a Content Delivery Network?

A:
A CDN serves static content from servers close to users, reducing latency and load on the origin server.

Example: Images and videos served from nearby locations.


What Is Caching and Why Is It Important?

Q: How does caching improve performance?

A:
Caching stores frequently accessed data closer to the user, reducing expensive network or database calls and speeding up responses.

Types: Browser, memory (Redis), CPU cache.


What Is an IP Address?

Q: What is an IP address and why is it needed?

A:
An IP address uniquely identifies a device on a network, allowing computers to locate and communicate with each other.


Explain TCP in Simple Terms

Q: Why is TCP considered reliable?

A:
TCP breaks data into packets, ensures they arrive in order, and resends any missing packets, guaranteeing reliable data transfer.


What Is DNS?

Q: What role does DNS play in the internet?

A:
DNS translates human-readable domain names (like google.com) into IP addresses so computers can find servers.


Why Do We Use HTTP Over TCP?

Q: Why isn’t TCP enough for web communication?

A:
TCP is low-level. HTTP adds structure like request methods, headers, and status codes, making it easier for developers to build web applications.


What Is REST?

Q: What are the key characteristics of REST APIs?

A:

  • Stateless

  • Uses HTTP methods

  • Standard status codes

  • Resource-based URLs
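
These characteristics can be sketched as a single Python function that maps an HTTP method and a resource URL to a status code and a response. This is an assumption-heavy toy: the /users resource, the in-memory dictionary, and the handle function are hypothetical, and no web framework is involved.

```python
users = {1: {"id": 1, "name": "Ada"}}  # in-memory stand-in for a database


def handle(method, path, body=None):
    """Return (status_code, response_body) for a request under /users."""
    parts = path.strip("/").split("/")
    if parts[0] != "users" or len(parts) > 2:
        return 404, {"error": "Not Found"}

    if len(parts) == 1:  # collection URL: /users
        if method == "GET":
            return 200, list(users.values())
        if method == "POST":
            new_id = max(users, default=0) + 1
            users[new_id] = {"id": new_id, **(body or {})}
            return 201, users[new_id]
        return 405, {"error": "Method Not Allowed"}

    # single-resource URL: /users/<id>
    if not parts[1].isdigit() or int(parts[1]) not in users:
        return 404, {"error": "Not Found"}
    user_id = int(parts[1])
    if method == "GET":
        return 200, users[user_id]
    if method == "DELETE":
        del users[user_id]
        return 204, None
    return 405, {"error": "Method Not Allowed"}


found = handle("GET", "/users/1")
missing = handle("GET", "/users/99")
```

Each call carries everything the handler needs (method, URL, body), which is what "stateless" means here: the server does not have to remember earlier requests to answer this one.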


REST vs GraphQL

Q: How is GraphQL different from REST?

A:
GraphQL allows clients to request exactly the data they need in a single request, avoiding over-fetching and multiple API calls common in REST.
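
The "ask for exactly what you need" idea can be imitated in plain Python. Note that this is not GraphQL syntax and does not use a GraphQL library; the select function, the selection format, and the sample record are all made up for illustration.

```python
user_record = {
    "id": 1,
    "name": "Ada",
    "email": "ada@example.com",
    "bio": "Mathematician",
    "posts": [
        {"id": 10, "title": "Hello", "body": "A long article body"},
    ],
}


def select(data, selection):
    """Return only the requested fields; nested dicts select nested fields."""
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub_selection in selection.items():
        value = data[field]
        if isinstance(sub_selection, dict):
            result[field] = select(value, sub_selection)
        else:
            result[field] = value
    return result


# The client wants the user's name and only the titles of their posts.
selection = {"name": True, "posts": {"title": True}}
response = select(user_record, selection)
```

The response contains the name and post titles in one round trip, and the unused fields (email, bio, post bodies) never leave the server.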


What Is gRPC and When Would You Use It?

Q: Why would you choose gRPC over REST?

A:
gRPC uses binary serialization (Protocol Buffers), making it faster and more efficient. It’s commonly used for internal, service-to-service communication.
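
To get a feel for why binary formats are more compact than text, the Python sketch below encodes the same record as JSON and as a fixed binary layout. It deliberately does not use gRPC or Protocol Buffers; Python's struct module is only a stand-in, and the sensor record is a made-up example.

```python
import json
import struct

reading = {"sensor_id": 42, "temperature": 21.5, "ok": True}

# Text encoding: field names and punctuation travel with every message.
as_text = json.dumps(reading).encode("utf-8")

# Binary encoding: both sides agree on the layout up front, so only values
# are sent. "<Id?" = little-endian unsigned 32-bit int, 64-bit float, bool.
LAYOUT = struct.Struct("<Id?")
as_binary = LAYOUT.pack(reading["sensor_id"], reading["temperature"], reading["ok"])

# The receiver decodes using the same agreed layout.
sensor_id, temperature, ok = LAYOUT.unpack(as_binary)
```

The binary form is smaller but unreadable without the shared layout, which mirrors the trade-off of a schema-based binary format versus human-readable JSON.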


What Problem Do WebSockets Solve?

Q: Why not use HTTP for real-time apps?

A:
With plain HTTP request/response, the client has to keep asking (polling) the server for updates. WebSockets enable persistent, two-way communication, allowing real-time updates like chat messages.


SQL vs NoSQL

Q: When would you choose SQL over NoSQL?

A:
Use SQL when you need strong consistency, transactions, and structured data (e.g., financial systems).

Use NoSQL for scalability, flexibility, and large distributed systems.


What Does ACID Mean?

Q: Explain ACID properties.

A:

  • Atomicity: All or nothing

  • Consistency: Data rules enforced

  • Isolation: Concurrent transactions don’t interfere

  • Durability: Data persists after crashes
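
Atomicity and consistency are easiest to see with a bank transfer. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the accounts table, the names, and the transfer function are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
)
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Move money in one transaction: both updates happen, or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice
final = balances(conn)
```

In the failed transfer, bob is credited first and then the debit breaks the balance rule. The rollback undoes the credit as well, so no money appears out of nowhere.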


What Is Sharding?

Q: How does sharding help scale databases?

A:
Sharding splits data across multiple machines using a shard key, allowing horizontal scaling and improved performance.
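
One simple way to use a shard key is to hash it and take the remainder by the number of shards. The Python sketch below assumes four shards held as in-memory dictionaries and uses a user ID as the shard key; shard_for, put, and get are hypothetical helpers.

```python
import hashlib


def shard_for(shard_key, shard_count):
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


shards = [{} for _ in range(4)]  # each dict stands in for one database server


def put(user_id, record):
    shards[shard_for(user_id, len(shards))][user_id] = record


def get(user_id):
    return shards[shard_for(user_id, len(shards))].get(user_id)


for i in range(100):
    put(f"user-{i}", {"n": i})

sizes = [len(shard) for shard in shards]
```

A caveat of this modulo approach is that changing the number of shards moves most keys to a different shard, which is why larger systems often use techniques such as consistent hashing instead.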


What Is Replication?

Q: How does replication differ from sharding?

A:
Replication creates copies of data for availability and read scaling, while sharding splits data across servers.
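
A leader-follower setup can be sketched as follows. This is a simplified model with hypothetical Leader and Follower classes: writes go to the leader, which copies them to every follower right away, and reads can then be served by any follower. Real databases usually replicate over the network and often asynchronously, so followers may lag behind.

```python
class Follower:
    """Holds a full copy of the data and serves reads."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

    def read(self, key):
        return self.data.get(key)


class Leader:
    """Accepts all writes and forwards them to its followers."""

    def __init__(self, followers):
        self.data = {}
        self.followers = followers

    def write(self, key, value):
        self.data[key] = value
        for follower in self.followers:  # synchronous copy, for simplicity
            follower.apply(key, value)


followers = [Follower(), Follower()]
leader = Leader(followers)

leader.write("user:1", "Ada")
reads = [follower.read("user:1") for follower in followers]
```

Every follower holds the whole dataset, so losing one machine loses no data. With sharding, by contrast, each machine holds only its own slice.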


What Is CAP Theorem?

Q: Explain CAP theorem in simple terms.

A:
In a distributed system, you can fully guarantee only two of:

  • Consistency

  • Availability

  • Partition tolerance

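Trade-offs are unavoidable. Since network partitions cannot be ruled out, the practical choice during a partition is between consistency and availability.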


What Are Message Queues Used For?

Q: Why introduce a message queue?

A:
Message queues buffer traffic, handle spikes, decouple services, and enable asynchronous processing.

Examples: Kafka, RabbitMQ, AWS SQS.
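
The buffering and decoupling can be imitated inside a single Python process with the standard queue and threading modules. This is only a stand-in for the idea: it is not Kafka, RabbitMQ, or SQS, and the worker function and order names are made up.

```python
import queue
import threading

tasks = queue.Queue()  # the "to-do list" shared by producer and consumer
processed = []


def worker():
    """Consumer: takes tasks one at a time, at its own pace."""
    while True:
        task = tasks.get()
        if task is None:        # sentinel value: no more work is coming
            tasks.task_done()
            break
        processed.append(task.upper())  # pretend this is slow work
        tasks.task_done()


consumer = threading.Thread(target=worker)
consumer.start()

# Producer: adds work as fast as it arrives, without waiting for the worker.
for order in ["pizza", "pasta", "salad"]:
    tasks.put(order)
tasks.put(None)

tasks.join()      # wait until every queued task has been handled
consumer.join()
```

The producer never calls the worker directly. It only knows about the queue, so either side can be slowed down, restarted, or scaled without changing the other.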

