Complete System Design Guide: Scalability, Networking, Databases, Web Communication & Messaging


Vertical Scaling vs. Horizontal Scaling

  • Vertical Scaling (Scaling Up): Upgrading the existing server by adding more CPU, RAM, or storage. It is simpler but has hardware limits.

  • Horizontal Scaling (Scaling Out): Adding more servers to distribute the load. It improves fault tolerance and scalability but requires load balancing.

Networking and Distribution

Load Balancers

  • Distribute traffic across multiple servers to ensure no single server is overwhelmed.

  • Can be software-based (Nginx, HAProxy) or hardware-based.

  • Common algorithms: round-robin, least connections, IP hash.
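To make the balancing strategies listed above concrete, here is a minimal sketch of each one. The class names, the `ip_hash_pick` helper, and the server names are assumptions made for illustration; real load balancers such as Nginx or HAProxy implement these strategies with health checks and weighting that this sketch leaves out.

```python
import hashlib
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._rotation = cycle(servers)

    def pick(self):
        return next(self._rotation)


class LeastConnectionsBalancer:
    """Picks the server that currently has the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a connection to this server closes.
        self.active[server] -= 1


def ip_hash_pick(servers, client_ip):
    """Sends the same client IP to the same server while the list is unchanged."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


servers = ["app-1", "app-2", "app-3"]  # hypothetical server names
rr = RoundRobinBalancer(servers)
print([rr.pick() for _ in range(4)])  # ['app-1', 'app-2', 'app-3', 'app-1']
```

Round-robin ignores how busy each server is, least connections adapts to uneven request durations, and IP hash keeps a given client on one server, which can help when session data lives on that server.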

Content Delivery Networks (CDN)

  • A network of distributed servers that cache content closer to users to reduce latency.

  • Examples: Cloudflare, Akamai, AWS CloudFront.

Caching

  • Stores frequently accessed data to reduce database queries and improve speed.

  • Types: Client-side (browser cache), Server-side (Redis, Memcached), CDN caching.
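The server-side case can be sketched as a small in-memory cache with expiry. This is only an illustration of the idea: `TTLCache`, `get_or_load`, and `load_user_from_db` are hypothetical names, and a production setup would more likely use Redis or Memcached than a Python dict.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (value, expiry time)

    def get_or_load(self, key, loader):
        entry = self._store.get(key)
        if entry is not None and entry[1] > self.clock():
            return entry[0]      # cache hit: skip the slow lookup
        value = loader(key)      # cache miss: fall through to the source
        self._store[key] = (value, self.clock() + self.ttl)
        return value


db_calls = []


def load_user_from_db(user_id):
    """Hypothetical stand-in for a database query."""
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


cache = TTLCache(ttl_seconds=60)
cache.get_or_load(42, load_user_from_db)
cache.get_or_load(42, load_user_from_db)
print(len(db_calls))  # 1: the second read was served from the cache
```

The expiry time bounds how stale a cached value can get; shorter values mean fresher data but more database queries.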

IP Address

  • A unique identifier assigned to each device on a network.

  • Types: IPv4 (32-bit, e.g., 192.168.1.1), IPv6 (128-bit, e.g., 2001:db8::ff00:42:8329).
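As a quick illustration, Python's standard `ipaddress` module can parse both example addresses above and report their properties:

```python
import ipaddress

v4 = ipaddress.ip_address("192.168.1.1")
v6 = ipaddress.ip_address("2001:db8::ff00:42:8329")

print(v4.version, v4.is_private)  # 4 True
print(v4.max_prefixlen)           # 32 (bits in an IPv4 address)
print(v6.version)                 # 6
print(v6.max_prefixlen)           # 128 (bits in an IPv6 address)
print(v6.exploded)                # 2001:0db8:0000:0000:0000:ff00:0042:8329
```

The `::` in the IPv6 example is shorthand for a run of all-zero groups, which `exploded` writes out in full.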

TCP/IP

  • TCP (Transmission Control Protocol): Ensures reliable, ordered, and error-checked delivery.

  • IP (Internet Protocol): Handles addressing and routing packets.

Domain Name System (DNS)

  • Converts human-readable domain names (e.g., google.com) into IP addresses.

  • Resolvers cache answers for each record's time-to-live (TTL), which speeds up repeat lookups.
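A lookup with a cache in front of it might be sketched as follows. `resolve` is a hypothetical helper built on the standard `socket.getaddrinfo` call, and `lru_cache` is a deliberate simplification: real resolvers expire entries by TTL instead of keeping them indefinitely. The example resolves `localhost` so that it does not depend on external network access.

```python
import socket
from functools import lru_cache


@lru_cache(maxsize=256)
def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address as a string.
    return tuple(sorted({info[4][0] for info in infos}))


print(resolve("localhost"))       # first call asks the system resolver
print(resolve("localhost"))       # second call is answered from the cache
print(resolve.cache_info().hits)  # number of lookups served from the cache
```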

Web Communication

HTTP

  • Protocol for web communication. Uses request methods like GET, POST, PUT, DELETE.

  • Stateless, meaning each request is independent.

REST

  • Representational State Transfer (REST) is an architectural style for APIs.

  • Uses stateless HTTP with resources identified by URLs.

  • Responses often use JSON or XML.
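To illustrate how HTTP methods map onto resources, here is a toy dispatcher for a hypothetical `/books/<id>` resource. It is a sketch only: `handle` and the in-memory `books` dict are assumptions for the example, and a real service would sit behind an HTTP server or web framework.

```python
import json

books = {}  # in-memory stand-in for a database table (hypothetical resource)


def handle(method, path, body=None):
    """Dispatch one request to /books/<id>; returns (status, json_text)."""
    parts = path.strip("/").split("/")
    if len(parts) != 2 or parts[0] != "books":
        return 404, json.dumps({"error": "not found"})
    book_id = parts[1]
    if method == "PUT":                 # create or replace the resource
        books[book_id] = body
        return 200, json.dumps(body)
    if book_id not in books:
        return 404, json.dumps({"error": "not found"})
    if method == "GET":                 # read the resource
        return 200, json.dumps(books[book_id])
    if method == "DELETE":              # remove the resource
        del books[book_id]
        return 204, ""
    return 405, json.dumps({"error": "method not allowed"})


print(handle("PUT", "/books/1", {"title": "SICP"}))  # (200, '{"title": "SICP"}')
print(handle("GET", "/books/1"))                     # (200, '{"title": "SICP"}')
print(handle("DELETE", "/books/1"))                  # (204, '')
```

Each call carries everything needed to process it (method, URL, body), which is what statelessness means here: the server keeps the resources but no per-client session.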

GraphQL

  • A query language for APIs that allows clients to request only the data they need.

  • More flexible than REST for clients, but adds server-side complexity such as schema design, resolvers, and harder HTTP caching.
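The idea of requesting only the needed fields can be mimicked in plain Python. This is not GraphQL itself and uses no GraphQL library; `select_fields` and the nested-dict selection format are assumptions that only imitate the shape of a query.

```python
def select_fields(record, selection):
    """Return only the fields named in selection.

    selection maps a field name to None (take the value as-is)
    or to a nested selection for objects and lists of objects.
    """
    result = {}
    for field, sub in selection.items():
        value = record[field]
        if sub is None:
            result[field] = value
        elif isinstance(value, list):
            result[field] = [select_fields(item, sub) for item in value]
        else:
            result[field] = select_fields(value, sub)
    return result


user = {
    "id": 7,
    "name": "Ada",
    "email": "ada@example.com",
    "posts": [{"id": 1, "title": "Hello", "body": "..."}],
}

# Roughly what the query `{ name posts { title } }` asks for.
selection = {"name": None, "posts": {"title": None}}
print(select_fields(user, selection))  # {'name': 'Ada', 'posts': [{'title': 'Hello'}]}
```

A REST endpoint would typically return the whole `user` object; here the client decides the response shape.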

gRPC

  • A high-performance, language-neutral framework for RPC (Remote Procedure Calls).

  • Uses Protocol Buffers (protobuf) instead of JSON for efficiency.

WebSockets

  • A bi-directional, persistent connection between client and server.

  • Used for real-time applications like chat apps and live notifications.

Databases and Storage

SQL

  • Structured Query Language for relational databases like PostgreSQL, MySQL, SQL Server.

  • Uses structured schemas and tables.

ACID

  • Ensures database transactions are Atomic, Consistent, Isolated, Durable (ACID).

  • Prevents issues such as partially applied updates and data corruption when a transaction fails or runs concurrently with others.
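Atomicity in particular is easy to demonstrate with Python's built-in `sqlite3` module. In this sketch the `accounts` table and the `transfer` helper are made up for illustration: a transfer that would overdraw an account fails, and the credit already applied in the same transaction is rolled back with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount between accounts; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(transfer(conn, "alice", "bob", 500))  # False: the CHECK constraint fails
print(dict(conn.execute("SELECT name, balance FROM accounts")))  # {'alice': 100, 'bob': 50}
```

Without the transaction, bob would have kept the 500 credit even though the debit from alice failed.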

NoSQL

  • Non-relational databases like MongoDB, Cassandra, DynamoDB.

  • Suited to unstructured or semi-structured data and to workloads that need to scale horizontally.

Sharding

  • Splitting large databases into smaller partitions (shards) to improve performance.

  • Each shard handles a subset of data.
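One common way to decide which shard holds a record is to hash its key. The sketch below assumes four shards and uses plain dicts in place of real databases; `shard_for`, `put`, and `get` are hypothetical helpers.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a key to a shard index using a stable hash.

    Python's built-in hash() is randomized per process for strings,
    so a SHA-256 digest is used to keep routing deterministic.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


shards = [dict() for _ in range(NUM_SHARDS)]  # each dict stands in for one database


def put(key, value):
    shards[shard_for(key)][key] = value


def get(key):
    return shards[shard_for(key)].get(key)


for user_id in range(1000):
    put(f"user:{user_id}", {"id": user_id})

print([len(shard) for shard in shards])  # roughly even split across the shards
print(get("user:42"))                    # {'id': 42}
```

Modulo hashing is simple, but changing the shard count remaps most keys; consistent hashing is a common alternative when shards are added or removed often.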

Replication

  • Copying data across multiple servers to improve redundancy, fault tolerance, and read performance.

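  • Types: Master-Slave (also called primary-replica: one node accepts writes and the others copy them) and Master-Master (multiple nodes accept writes).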
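The single-writer arrangement can be sketched in a few lines. `Primary` and `Replica` are illustrative classes, and the sketch assumes synchronous replication for simplicity; many real systems replicate asynchronously, so replicas can briefly lag behind.

```python
class Replica:
    """Read-only copy that applies changes forwarded by the primary."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

    def read(self, key):
        return self.data.get(key)


class Primary:
    """Accepts all writes and forwards each one to every replica."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = list(replicas)

    def write(self, key, value):
        self.data[key] = value
        for replica in self.replicas:  # synchronous: done before write() returns
            replica.apply(key, value)


replicas = [Replica(), Replica()]
primary = Primary(replicas)
primary.write("user:1", "Ada")
print([replica.read("user:1") for replica in replicas])  # ['Ada', 'Ada']
```

Reads can be spread across the replicas, which is where the read-performance benefit comes from, and any replica holds a full copy if the primary fails.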

CAP Theorem

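  • Consistency, Availability, Partition Tolerance: a distributed system can only guarantee two out of three. Because network partitions cannot be ruled out in practice, the real choice during a partition is between consistency and availability.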

  • CP (Consistency + Partition Tolerance): MongoDB (in its typical configuration)

  • AP (Availability + Partition Tolerance): Cassandra (in its typical configuration; consistency is tunable per query)

Messaging and Queues

Message Queues

  • Asynchronous communication between services using a queue.

  • Buffers traffic spikes so consumers can process messages at their own pace, which prevents overload and improves scalability.

  • Examples: Kafka, RabbitMQ, AWS SQS.
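The producer/consumer pattern can be sketched in-process with Python's standard `queue` and `threading` modules. This only illustrates the idea: brokers such as Kafka, RabbitMQ, or SQS run as separate services and add persistence and delivery guarantees, and the message names here are invented.

```python
import queue
import threading

tasks = queue.Queue(maxsize=100)  # bounded, so producers block instead of flooding
results = []


def worker():
    """Consume messages until the None sentinel arrives."""
    while True:
        message = tasks.get()
        if message is None:
            tasks.task_done()
            break
        results.append(message.upper())  # stand-in for real processing
        tasks.task_done()


consumer = threading.Thread(target=worker)
consumer.start()

for message in ["order-created", "order-paid", "order-shipped"]:
    tasks.put(message)  # the producer returns as soon as the message is queued
tasks.put(None)         # sentinel: tell the consumer to stop

tasks.join()            # wait until every queued message has been processed
consumer.join()
print(results)  # ['ORDER-CREATED', 'ORDER-PAID', 'ORDER-SHIPPED']
```

The producer never waits for processing to finish, only for room in the queue, which is how a queue absorbs bursts that would otherwise overwhelm the consumer.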
