Basics

“Everything is a trade-off”

Distributed System → a network of computers that work together to perform task(s)
Characteristics of a Distributed System
- Availability → % uptime (ex: 99.99%)
- Reliability → remain operational despite component(s) failure, implies Availability
  - Both achieved by Replication of data and Redundancy of services
- Scalability / Scaling → cope with increased demand without drop in performance
  - Vertical Scaling - get a bigger machine, increase CPU, RAM, Storage, etc.
  - Horizontal Scaling - add more machines/nodes, require Load Balancing
- Efficiency
  - Latency / Response Time
  - Throughput / Bandwidth
- Manageability / Maintainability / Serviceability → simplicity and speed to repair system, impacts Availability
Theorems
- Definitions
  - Consistency → system maintains a consistent state across all nodes
  - Availability → system should always respond to requests regardless of the current state of the system
  - Partition Tolerance → system continues to function even if there is a network partition
  - Latency → time taken for system to respond to requests
- CAP → in the presence of a network partition, system must choose between Availability and Consistency
- PACELC → extension of CAP theorem, in the absence of network partitions, system must choose between Latency (Lower) and Consistency (Strong)
Consistency Patterns
- Weak Consistency
- Eventual Consistency → data replicated asynchronously, takes milliseconds
- Strong Consistency → data replicated synchronously
Availability Patterns
- Complementary Patterns: Fail-Over and Replication
- Fail-Over
  - Active-Passive → heartbeats sent between active and passive servers, if the heartbeat is interrupted, the passive server assumes the active server’s IP address and resumes service
  - Active-Active → both servers are managing traffic, spreading the load between them
    - Public Facing → DNS need to know IPs of servers
    - Internal Facing → Application logic need to know IPs of servers
- Replication → improve reliability and fault tolerance
  - Single Leader, Multi Leader, Leaderless
    - Single Leader Replication → writes/reads go to leader, reads go to followers (read-replicas), suitable for read-heavy system
    - Multi Leader Replication → both leaders serves reads and writes and coordinate with each other on writes
    - Leaderless Replication
  - Synchronous, Asynchronous, Semi-Synchronous
    - Synchronous → Strong Consistency
    - Asynchronous → Eventual Consistency
Datastores
- Relational / SQL → structured data, stored in tables/rows/columns, has primary key(s) and foreign key(s), able to perform joins between tables, support transactions, ACID-compliant, require atomic reads/writes, normalization
- Non-relational / NoSQL → semi-structured and unstructured data, BASE-compliant, denormalization
  - Key-value → like a HashMap
    - In-Memory Key-value
  - Document → similar to JSON
  - Wide Column → use tables, rows and columns but names and format of the columns can vary from row to row in the same table
  - Graph → model relationships, use nodes (entities) and edges (relationships), think of Adjacency List, Neo4j
Data Consistency Models
- Trade-off between Consistency and Availability
- ACID (Atomicity, Consistency, Isolation, Durability) → associated with relational DB, support transactions and data integrity features, need strict consistency and isolation
  - Offer stronger consistency guarantees but may be less available
- BASE (Basically Available, Soft state, Eventually consistent) → associated with non-relational DB, easily horizontally scalable, need high availability and scalability
  - Offer weaker consistency guarantees but is more available
Data Replication → process of making multiple copies of data and storing them on different servers, improves Availability and Durability of data
Sharding or Data Partitioning → process of distributing data across a set of servers, improves the scalability and performance of the system
- Horizontal Partitioning / Sharding→ store rows in different nodes, use some key to partition data, key is an attribute of the data
  - Range-based → ranges of a key
  - Hash-based → pass key to hash function
  - Directory-based → Lookup table
  - Custom-based
- Vertical Partitioning → store columns in different nodes
Hashing → hashing function that maps key to value
Consistent Hashing → Distributed Hash Table (DHT) algorithm, only n / m keys need to be remapped when a node gets added/removed (n = no. of keys, m = no. of slots)
Database Indexes
Proxies
- Reverse Proxy → intermediary between Web Servers and the Internet
- Forward Proxy → intermediary between clients and the Internet
Load Balancing/Balancer → distribute traffic across cluster of servers (traffic police)
- Redundant LB (active/passive) → eliminate single Point of Failure (PoF)
  - Traffic Distribution
    - Perform Health Checks
    - Naive: Round Robin, Weighted Round Robin
    - Better: Least Response Time
- Location: Between
  - User and Web Server
  - Web Server and Internal Platform Layer (Application Servers or Cache Servers)
  - Internal Platform Layer and Database
Cache → store data temporarily in-memory
- Time to Live (TTL)
- Cache Invalidation
  - Write-through Cache
  - Write-around Cache
  - Write-back Cache
- Cache Eviction Policies
  - Least Recently Used (LRU)
- Content Delivery Network (CDN) → geographically distributed cache servers, push content closer to the users, improve performance by reducing latency (shortened physical distance)
  - Push CDN → push content to CDN whenever changes occur on your server
  - Pull CDN → grab new content from your server
Network Protocols → help transmit data over the internet
- Transmission (TCP) → transport-layer protocol, ensure that data is delivered to its destination without errors and in the correct order
- User Datagram Protocol (UDP) → transport-layer protocol, does not guarantee the delivery of data or the order in which it is delivered, used for real-time applications (online gaming or voice over IP), low latency is more important than reliability
- Hypertext Transfer Protocol (HTTP) → application-layer protocol, used to transfer data over the web
  - HTTP1 → stateless protocol, request-response model, text-based encoding, built on top of TCP
  - HTTP2 → similar to previous version but improved: supports multiplexing (able to send multiple requests over a single connection), binary encoding, support server push, header compression
  - HTTP3 → stateful protocol, based on UDP, new binary encoding (QUIC), server push
Communication Protocols
- Polling → AJAX/Fetch requests at regular intervals
- Long Polling → like Polling, but server holds on to the connection longer until new data is available (timeout if too long without response)
- WebSockets → full duplex (bidirectional) connection over TCP, establish “Websocket handshake” between client and server
- Server Sent Events → client establishes a persistent and long-term connection with server, server sends real-time data to client (unidirectional)
Architecture Patterns
- Monolith → deploy as a single, large, stand-alone application (single codebase)
- Microservices → deploy application as a collection of small, independent services that communicate with each other over the network
- Event-Driven Architecture (EDA) → events to trigger and communicate between decoupled services, consist of Event Producers, Event Routers (Message Brokers), and Event Consumers (Workers)
  - Serverless → Functions that respond to requests, event-driven
Components of Web-based Distributed System
- Clients → Mobile, Desktop, etc.
- Domain Name System (DNS) → map domain names to IP addresses
- Content Delivery Network (CDN) → geographically distributed, store static content
- API Gateway → authentication/authorization, route to proper services
  - Load Balancer → traffic police
- Web Server(s) → handle HTTP requests and responses only
- Message Queue → store and transmit requests for further processing by Application Server(s)
- Application Server(s) → execute business logic to application programs through any number of protocols
- File System → store and manage files
- Cache Server(s) → temporary, store hot DB rows in memory
- Database Server(s) → read and write data to disk
Popular Services
- Relational Database → Amazon RDS/Aurora, Azure SQL Database, Google Spanner
- Non-relational Database → Amazon DynamoDB, Azure CosmosDB, Google Datastore
- Object Storage → AWS S3, Azure Blob Storage, Google Cloud Storage
- Cache (in-memory datastore) → Amazon ElastiCache, Azure Cache, Google Memorystore
  - Redis, Memcached
- CDN → Amazon CloudFront, Azure CDN, Google CDN
- Serverless → AWS Lambda, Azure Functions, Google Cloud Functions

Basics

“Everything is a trade-off” #

“Everything is a trade-off”