Key Steps
Steps
- Requirements / Clarifications / Minimum Viable Product (MVP) / Goals
- Functional Requirements
- Basic functionalities of the app
- Non-functional Requirements
- Availability, Consistency, Latency, Scalability…
- High Availability, High Reliability, Low Latency, Highly Scalable
- Can consistency take a hit in favor of availability/lower latency?
- Low latency in …
- Extended Requirements (recommended if you have more time)
- Estimation and Constraints / Back of the Envelope Calculation / Rough Estimates
- Read heavy? Write Heavy? Read to Write Ratio
- The design should focus on whichever operation is more common
- Traffic
- Total Users
- Daily Active Users (DAU)
- Queries Per Second (QPS) / Requests Per Second (RPS)
- Bandwidth (manage traffic and balance load between servers)
- Storage
- Memory / Cache
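The estimation items above can be turned into a small back-of-the-envelope calculator. This is only a sketch: the function name `estimate` and every input figure (10 million DAU, 10 reads and 1 write per user per day, 1 KB per write, 5 years of retention) are made-up assumptions for illustration, not numbers from any real system.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate(dau, reads_per_user, writes_per_user, bytes_per_write, retention_years):
    """Return rough traffic and storage figures. All inputs are assumptions."""
    daily_reads = dau * reads_per_user
    daily_writes = dau * writes_per_user
    daily_storage = daily_writes * bytes_per_write
    return {
        "read_qps": daily_reads / SECONDS_PER_DAY,
        "write_qps": daily_writes / SECONDS_PER_DAY,
        "read_write_ratio": daily_reads / daily_writes,
        "daily_storage_bytes": daily_storage,
        "total_storage_bytes": daily_storage * 365 * retention_years,
        # 80-20 rule: size the cache for the 20% of daily read volume assumed hot.
        "cache_bytes": daily_reads * bytes_per_write // 5,
        "egress_bytes_per_sec": daily_reads * bytes_per_write / SECONDS_PER_DAY,
    }


# Hypothetical workload, chosen only to make the arithmetic easy to follow.
figures = estimate(
    dau=10_000_000,
    reads_per_user=10,
    writes_per_user=1,
    bytes_per_write=1_000,
    retention_years=5,
)
for name, value in figures.items():
    print(f"{name}: {value:,.0f}")
```

With these assumed inputs the system is read heavy (10:1), needs roughly 10 GB of new storage per day, and about 18 TB over five years. Average QPS understates peak load, so it is common to multiply by a safety factor before sizing servers.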
- Data Model Design / Database Design
- Relations
- Type of Database
- Is the data relational? Require joins?
- Schema / Table
- API Design
- Function Signatures
- High Level Component Design
- Identify components that are needed
- API Gateway, Load Balancers, Multiple Application Servers
- Separate read and write servers
- Datastores
- Database
- Distributed File Storage System (photos and videos) / CDN
- Detailed Component Design
- No right answer, consider trade-offs
- How will we partition our data to distribute it to multiple databases?
- How do we handle hot active users?
- For recent data: should we store data in a layout optimized for scanning the latest records?
- How much and at which layer should we introduce caching?
- What components need better load balancing?
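One possible answer to the partitioning question above is hash-based sharding on a key such as the user ID. The sketch below is an illustration under assumptions: `shard_for`, the `user-<n>` key format, and the shard counts are hypothetical, and MD5 is used only as a stable, well-distributed hash, not for security.

```python
import hashlib
from collections import Counter


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (same result on every run)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical user IDs spread over 4 database shards.
user_ids = [f"user-{i}" for i in range(10_000)]
counts = Counter(shard_for(uid, 4) for uid in user_ids)
print(sorted(counts.items()))

# Trade-off: with plain modulo hashing, changing the shard count remaps most keys.
moved = sum(shard_for(uid, 4) != shard_for(uid, 5) for uid in user_ids)
print(f"{moved / len(user_ids):.0%} of keys move when going from 4 to 5 shards")
```

Hashing spreads keys evenly, but it does not help with hot users: all traffic for one very active key still lands on a single shard, so that case needs a separate answer such as caching or splitting the key.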
- Identify and Resolve Bottlenecks
- Is there a Single Point of Failure
- Enough Data Replication?
- Enough Copies of Services?
- Performance Monitoring
Universal Tricks
- Introduce Cache
- For read-heavy system
- Reduce load on database
- Multiple instances and replicas of our globally distributed cache
- Redis / Memcached
- Cache Eviction: LRU
- Pareto Principle: 80-20 rule
- CDN for static assets
- Geographically distributed
- Address latency
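The LRU (least recently used) eviction policy listed above for the cache can be sketched in a few lines. This is a minimal single-process illustration built on `collections.OrderedDict`; the class name `LRUCache` and its methods are assumptions for the example, and it is not how Redis or Memcached implement eviction internally.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # "a" is now the most recently used key
cache.put("c", 3)  # over capacity, so "b" is evicted
print(cache.get("a"), cache.get("b"), cache.get("c"))
```

Reading "a" before inserting "c" is what saves it: the cache holds only two entries, and "b" is the one that has gone longest without being touched.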
- Redundancy and Replication
- Introduce Load Balancers
- For horizontal scaling
- Consistent Hashing: a useful strategy for distributed caching systems and distributed hash tables
- Start with Round Robin (RR), then switch to a dynamic, load-aware strategy
- Consistent Hashing
- Distributes requests uniformly across nodes so that nodes can be added or removed with minimal remapping of keys
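Consistent hashing, as described above, can be sketched with a sorted ring of hash positions and several virtual nodes per server. This is a simplified illustration: `ConsistentHashRing`, the `vnodes` parameter, and the node and key names are assumptions for the example, and production implementations differ in hashing and replication details.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Hash ring where each node owns several virtual positions."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._hashes = []  # sorted ring positions
        self._owner = {}   # ring position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value: str) -> int:
        return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            position = self._hash(f"{node}#{i}")
            if position not in self._owner:  # ignore the (very unlikely) collision
                self._owner[position] = node
                bisect.insort(self._hashes, position)

    def remove_node(self, node: str) -> None:
        self._hashes = [h for h in self._hashes if self._owner[h] != node]
        self._owner = {h: self._owner[h] for h in self._hashes}

    def get_node(self, key: str) -> str:
        if not self._hashes:
            raise LookupError("ring is empty")
        # First ring position clockwise from the key, wrapping around at the end.
        idx = bisect.bisect_left(self._hashes, self._hash(key)) % len(self._hashes)
        return self._owner[self._hashes[idx]]


ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}

ring.add_node("node-d")
after = {k: ring.get_node(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding node-d")
```

Only keys that now fall on the new node's ring positions move; every other key keeps its previous owner, which is the property that makes adding or removing cache servers cheap.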
- High Reliability
- Multiple copies
- Multiple Datastores
- Relational Database
- Sharding / Horizontal Partitioning
- Multiple Read Replicas (part of handling heavy reads)
- Non-relational Database
- Easily scalable (horizontal scaling)
- Read-heavy system
- Indexing for faster search
- Makes queries on the indexed columns faster by maintaining pointers to where the matching rows are stored
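The effect of an index can be seen with Python's built-in `sqlite3` module by comparing the query plan before and after creating one. The table `users`, its columns, and the index name `idx_users_email` are hypothetical, and the exact plan text varies between SQLite versions, so treat this as a sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

query = "SELECT name FROM users WHERE email = ?"
params = ("user42@example.com",)


def plan(sql, args):
    """Return SQLite's query plan description (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, args).fetchall()
    return " | ".join(row[3] for row in rows)


before = plan(query, params)  # full table scan: every row is checked
conn.execute("CREATE INDEX idx_users_email ON users (email)")
after = plan(query, params)   # index search: jumps to the matching rows

print("before:", before)
print("after: ", after)
print(conn.execute(query, params).fetchone())
```

The trade-off is that every index must be updated on writes and takes extra storage, which is why indexing pays off most in read-heavy systems.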
Popular Services
- SQL
- Amazon RDS (MySQL, PostgreSQL)
- Azure SQL Database
- NoSQL
- Apache Cassandra
- Google Cloud Datastore
- Key-Value Store
- Amazon DynamoDB
- Object Store
- Amazon S3 (Simple Storage Service)
- Azure Blob Storage
- Google Cloud Storage
- Graph Database
- Neo4j
- CDN
- Amazon CloudFront
- Azure CDN
- Google Cloud CDN
- Cache
- Redis
- Memcached
- Search
- Elasticsearch
Not commonly talked about
- Security
Other Outlines: https://github.com/jguamie/system-design/blob/master/notes/system-design-outline.md
Miscellaneous
1 byte = 8 bits
1 KB = 10^3 bytes
1 MB = 10^6 bytes
1 GB = 10^9 bytes
1 TB = 10^12 bytes
1 PB = 10^15 bytes
1 EB = 10^18 bytes
In UTF-8, one character takes 1 to 4 bytes
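The unit table and the UTF-8 size range can be checked quickly in Python. The constant names and the sample characters below are chosen only for illustration.

```python
# Decimal (SI) units from the table above.
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12

# One sample character for each UTF-8 encoded length.
samples = {
    "A": "A",                     # ASCII letter
    "e-acute": "\u00e9",          # Latin letter with accent
    "euro sign": "\u20ac",        # currency symbol
    "grinning face": "\U0001F600",  # emoji outside the Basic Multilingual Plane
}
sizes = {name: len(ch.encode("utf-8")) for name, ch in samples.items()}
for name, size in sizes.items():
    print(f"{name}: {size} byte(s)")

# Example: one million 280-character ASCII messages expressed in MB.
print(1_000_000 * 280 / MB, "MB")
```

For estimates, assuming 1 byte per character is fine for mostly ASCII text, while text in other scripts or with emoji needs a larger per-character budget.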