Video Streaming Platform

Notes from grokking system design

Services: YouTube, Netflix

Requirements

Functional
- Upload, view and share videos
- Search videos from titles/description
- Record stats (likes, number of views)
- Comment on videos
Non-functional
- High Availability
  - Consistency can take a hit in the interest of availability and lower latency; it is fine if an uploaded video takes a while to become available
- High Reliability (uploaded videos are never lost)
- Minimal Latency (while watching videos)

Capacity Estimation / Constraints

Read-heavy system: read (view) to write (upload) ratio of 200:1
Traffic
- Total Users: 1.5 B
- Daily Active Users (DAU): 800 M
- Assume: A user watches 5 videos / day
  - (Read) 800 M DAU * 5 videos / day * 1 day / 86400 s = 46 K videos / s
  - (Write) 46 K videos / s * 1/200 = 230 videos / s
Storage
- Assume
  - 500 hr worth of videos are uploaded / min
  - On average, 1 min video needs 50 MB storage (videos need to be stored in multiple formats)
    - 500 hr of videos uploaded / min * 60 min / hr * 50 MB / min of video = 1500 GB / min = 25 GB / s
      - Ignored video compression and replication
Bandwidth - assume each minute of uploaded video takes 10 MB of bandwidth
- (Write / Ingoing) 500 hr of videos uploaded / min * 60 min / hr * 10 MB / min = 300 GB / min = 5 GB / s
- (Read / Outgoing) 200 * 5 GB / s = 1 TB / s
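
The estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch that only restates the assumptions already listed (800 M DAU, 5 videos / day, 200:1 ratio, 500 hr uploaded / min, 50 MB and 10 MB per minute of video); the variable names are illustrative, and decimal units (1 GB = 1000 MB) are assumed.

```python
SECONDS_PER_DAY = 86_400

# Traffic assumptions from the notes
dau = 800_000_000
videos_per_user_per_day = 5
read_write_ratio = 200

views_per_sec = dau * videos_per_user_per_day / SECONDS_PER_DAY
uploads_per_sec = views_per_sec / read_write_ratio

# Storage and bandwidth assumptions from the notes
upload_hours_per_min = 500
storage_mb_per_video_min = 50   # all stored formats combined
ingest_mb_per_video_min = 10    # bandwidth per uploaded minute of video

video_minutes_per_min = upload_hours_per_min * 60  # minutes of video arriving each minute
storage_gb_per_sec = video_minutes_per_min * storage_mb_per_video_min / 1000 / 60
ingress_gb_per_sec = video_minutes_per_min * ingest_mb_per_video_min / 1000 / 60
egress_tb_per_sec = ingress_gb_per_sec * read_write_ratio / 1000

print(f"views/s   ~ {views_per_sec:,.0f}")
print(f"uploads/s ~ {uploads_per_sec:,.0f}")
print(f"storage   = {storage_gb_per_sec} GB/s")
print(f"ingress   = {ingress_gb_per_sec} GB/s")
print(f"egress    = {egress_tb_per_sec} TB/s")
```

The unrounded write rate comes out slightly above 230 videos / s because the notes round the read rate to 46 K before dividing.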

System APIs

upload_video(api_dev_key, video_contents, video_title, video_description, tags[], category_id, default_language, recording_details)
- Parameters:
  - api_dev_key (string): limit users
  - video_contents (stream): video to be uploaded
  - video_title (string)
  - video_description (string)
  - tags (string[])
  - category_id (string)
  - default_language (string)
  - recording_details (string): Location info
- Returns:
  - result (JSON):
    - status: HTTP 202 (request accepted) for a successful upload; an email notification with the link can be sent as well
    - link: access to the video
search_video(api_dev_key, search_query, user_location, maximum_videos_to_return, page_token)
- Parameters:
  - api_dev_key (string)
  - search_query (string): search term
  - user_location (string)
  - maximum_videos_to_return (number)
  - page_token (string): specify a page in the result set that should be returned
- Returns:
  - result (JSON): contains information about the list of video resources matching the search query
    - videos (array): each has video title, a thumbnail, a video creation date, and a view count
stream_video(api_dev_key, video_id, offset, codec, resolution)
- Parameters:
  - api_dev_key (string)
  - video_id (string)
  - offset (number): time offset in seconds from the beginning of video
  - codec (string) & resolution (string): to support play/pause from multiple devices
- Returns: stream
  - Media stream (video chunk) from given offset
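
One way to picture how stream_video turns an offset into a video chunk is sketched below. The notes do not say how chunks are laid out, so the fixed 10-second chunk length, the ChunkRef type and the locate_chunk helper are all assumptions made for illustration.

```python
from dataclasses import dataclass

CHUNK_SECONDS = 10  # assumed fixed chunk duration, not from the notes


@dataclass(frozen=True)
class ChunkRef:
    """Hypothetical pointer to the chunk that contains a playback offset."""
    video_id: str
    codec: str
    resolution: str
    index: int            # which chunk to serve first
    skip_seconds: float   # how far into that chunk playback starts


def locate_chunk(video_id: str, offset: float, codec: str, resolution: str) -> ChunkRef:
    """Map a time offset (seconds from the start) to a chunk of one encoding."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    index = int(offset // CHUNK_SECONDS)
    return ChunkRef(video_id, codec, resolution, index, offset - index * CHUNK_SECONDS)


ref = locate_chunk("v1", 95, "h264", "720p")
print(ref.index, ref.skip_seconds)
```

Because codec and resolution are part of the lookup, a viewer who pauses on one device can resume on another at the same offset with a different encoding.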

Database Schema

Video Metadata - SQL
- Video
  - VideoID (PK), Title, Description, Size, Thumbnail, Uploader/UserID, TotalNumberOfLikes, TotalNumberOfViews
- Comment
  - CommentID (PK), VideoID (FK), UserID (FK), Comment, CreationDate
User Metadata - SQL
- UserID (PK), Name, Email, Address, Age, RegistrationDetails
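
The tables above could be declared roughly as follows. This is a sketch using Python's built-in sqlite3 module purely for illustration; the column types, the NOT NULL and DEFAULT constraints, and the UploaderID column name are assumptions, since the notes only list the column names and keys.

```python
import sqlite3

SCHEMA = """
CREATE TABLE User (
    UserID              TEXT PRIMARY KEY,
    Name                TEXT NOT NULL,
    Email               TEXT,
    Address             TEXT,
    Age                 INTEGER,
    RegistrationDetails TEXT
);
CREATE TABLE Video (
    VideoID            TEXT PRIMARY KEY,
    Title              TEXT NOT NULL,
    Description        TEXT,
    Size               INTEGER,
    Thumbnail          TEXT,
    UploaderID         TEXT NOT NULL REFERENCES User(UserID),
    TotalNumberOfLikes INTEGER NOT NULL DEFAULT 0,
    TotalNumberOfViews INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE Comment (
    CommentID    TEXT PRIMARY KEY,
    VideoID      TEXT NOT NULL REFERENCES Video(VideoID),
    UserID       TEXT NOT NULL REFERENCES User(UserID),
    Comment      TEXT NOT NULL,
    CreationDate TEXT NOT NULL
);
"""


def create_db() -> sqlite3.Connection:
    """Create an in-memory database with the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)
    return conn
```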

High Level Design

Processing Queue - videos to be dequeued for encoding, thumbnail generation and storage
Encoder - encode uploaded videos into multiple formats
Thumbnails Generator
Video and Thumbnail Storage - distributed file storage / object storage
User Datastore
Video Metadata Datastore
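
The flow between these components can be sketched as a queue drained by a worker. Everything here is a stand-in: the format list, the encode and make_thumbnail helpers, and the in-memory dictionaries playing the role of object storage and the metadata datastore are assumptions for illustration, not a real encoding pipeline.

```python
import queue

FORMATS = ("360p", "720p", "1080p")  # assumed target formats


def encode(video_id: str, fmt: str) -> str:
    """Placeholder encoder: returns the storage key of the encoded file."""
    return f"{video_id}/{fmt}.mp4"


def make_thumbnail(video_id: str) -> str:
    """Placeholder thumbnail generator: returns the thumbnail's storage key."""
    return f"{video_id}/thumb.jpg"


def drain(processing_queue: queue.Queue, storage: dict, metadata: dict) -> None:
    """Dequeue uploaded videos, then encode, thumbnail and store each one."""
    while True:
        try:
            video_id = processing_queue.get_nowait()
        except queue.Empty:
            return
        keys = [encode(video_id, fmt) for fmt in FORMATS]
        keys.append(make_thumbnail(video_id))
        storage[video_id] = keys          # video and thumbnail storage
        metadata[video_id] = "ready"      # video metadata datastore
        processing_queue.task_done()


q = queue.Queue()
q.put("v1")
storage, metadata = {}, {}
drain(q, storage, metadata)
print(storage["v1"], metadata["v1"])
```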

Detailed Component Design

Read-heavy system - more views than uploads
Video stored using a Distributed File Storage System (HDFS) or Distributed Object Storage (S3)
- Maintain multiple copies of videos
Microservices Architecture - separate the services, scale them independently
- Read Service
  - Load balancer (LB) - distributes traffic across servers, since multiple copies of each video exist
- Write Service
  - Single Leader Replication
    - Can cause some staleness in data, but still acceptable (takes some milliseconds to update)
  - Video Uploads - Support resuming from the same point if connection fails, store uploaded videos on server
  - Video Encoding - New task added to processing queue to encode video into multiple formats
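
Resuming an upload from the same point can be sketched as the server remembering how many bytes it already holds, with the client asking for that offset before sending more. The UploadSession class, the upload helper and its fail_after parameter (used only to simulate a dropped connection) are hypothetical names for illustration.

```python
class UploadSession:
    """Server-side state for one upload: the bytes received so far."""

    def __init__(self, total_size: int):
        self.total_size = total_size
        self.data = bytearray()

    @property
    def offset(self) -> int:
        """Next byte position the server expects."""
        return len(self.data)

    @property
    def complete(self) -> bool:
        return self.offset == self.total_size

    def append(self, offset: int, chunk: bytes) -> int:
        """Accept a chunk only if it starts exactly where the server left off."""
        if offset != self.offset:
            raise ValueError("chunk does not start at the expected offset")
        if self.offset + len(chunk) > self.total_size:
            raise ValueError("chunk exceeds the declared upload size")
        self.data.extend(chunk)
        return self.offset


def upload(session: UploadSession, payload: bytes, chunk_size: int, fail_after=None) -> None:
    """Client side: start from the server's offset and send the rest in chunks."""
    sent = 0
    pos = session.offset
    while pos < len(payload):
        if fail_after is not None and sent >= fail_after:
            raise ConnectionError("simulated connection drop")
        pos = session.append(pos, payload[pos:pos + chunk_size])
        sent += 1
```

After a ConnectionError, calling upload again with the same session continues from session.offset instead of resending the whole file.
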
Thumbnails
- Read-heavy - Higher thumbnail views than video views
  - Watching one video, there can be multiple other thumbnails of other videos
  - Each thumbnail is small in size (5-10 KB)
- Store in Bigtable
  - Combines multiple files into one block to store on the disk
  - Efficient in reading a small amount of data
- Cache hot thumbnails to improve latencies

Metadata Sharding

Based on UserID - store all data for a specific user on one server
- Pass the UserID to a hash function, which maps the user to a database server
- Querying the videos of a particular user is straightforward: pass the UserID to the hash function to find the server holding them
- To search videos by title, query all servers; each server returns a set of videos, and a centralized server aggregates and ranks the results before returning them to the user
- Issues
  1. Performance bottleneck if a user becomes popular (a lot of queries on the server holding the user)
  2. Nonuniform distribution of data, as some users may store a lot more than others
- Solution - use Consistent Hashing to balance the load between servers, or repartition/redistribute the data
Based on VideoID - a hash function maps each VideoID to a random server, which stores that video’s metadata
- To find the videos of a user, query all servers; each server returns a set of videos, and a centralized server aggregates and ranks the results before returning them to the user
  - This approach solves our problem of popular users but shifts it to popular videos
- Improve performance by caching hot videos
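
The VideoID-based scheme can be sketched as follows: a hash picks the shard for single-video lookups, while per-user queries fan out to every shard and are merged centrally. The ShardedVideoStore class, the shard_for helper, the choice of SHA-256 and ranking by view count are all assumptions for illustration; in-memory dictionaries stand in for database servers.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index with a stable hash (plain modulo, not consistent hashing)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedVideoStore:
    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, video_id: str, user_id: str, title: str, views: int = 0) -> None:
        shard = self.shards[shard_for(video_id, len(self.shards))]
        shard[video_id] = {"video_id": video_id, "user_id": user_id,
                           "title": title, "views": views}

    def get(self, video_id: str):
        """Single-shard lookup: only the shard owning this VideoID is touched."""
        return self.shards[shard_for(video_id, len(self.shards))].get(video_id)

    def videos_of_user(self, user_id: str) -> list:
        """Scatter-gather: ask every shard, then aggregate and rank centrally."""
        results = []
        for shard in self.shards:
            results.extend(v for v in shard.values() if v["user_id"] == user_id)
        return sorted(results, key=lambda v: v["views"], reverse=True)
```

Sharding by UserID would flip the trade-off: videos_of_user would touch one shard, while a lookup by VideoID alone would need the fan-out.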

Video Deduplication

Perform deduplication early in the process
Run matching algorithms (Block Matching, Phase Correlation, etc.) to find video duplications
Replace the existing video if
- The new video is of higher quality
- The new video is longer, or the existing video is a subpart of the new video
  - Intelligently divide the video into smaller chunks, only upload the new parts
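
Uploading only the new parts can be illustrated with chunk fingerprints. Note that this sketch swaps in exact byte-level hashing over fixed-size chunks, which is much simpler than the block matching or phase correlation mentioned above and would miss re-encoded duplicates; the helper names and the chunk size are assumptions.

```python
import hashlib


def chunk_hashes(data: bytes, chunk_size: int) -> list:
    """Fingerprint each fixed-size chunk of a video's bytes."""
    return [hashlib.sha256(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]


def chunks_to_upload(new_video: bytes, known_hashes: set, chunk_size: int) -> list:
    """Return the indices of chunks whose content the server has not seen yet."""
    missing = []
    for index, digest in enumerate(chunk_hashes(new_video, chunk_size)):
        if digest not in known_hashes:
            missing.append(index)
    return missing


existing = b"AAAABBBB"
longer = b"AAAABBBBCCCC"   # the existing video is a subpart of this one
known = set(chunk_hashes(existing, 4))
print(chunks_to_upload(longer, known, 4))
```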

Load Balancing

Use Consistent Hashing among cache servers
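
A minimal consistent-hashing ring is sketched below to show why it suits cache servers: when a server is removed, only the keys that lived on it move. The HashRing class, the use of SHA-256, the 100 virtual nodes per server and the server names are assumptions for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes: int = 100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        """Place several virtual nodes per server to even out the load."""
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def lookup(self, key: str) -> str:
        """Walk clockwise from the key's position to the first virtual node."""
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
print(ring.lookup("video-42"))
```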

Cache & Content Delivery Network

Cache hot database rows for metadata servers
- Eviction Policy: LRU, discard least recently viewed row first
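
LRU eviction can be sketched with an ordered dictionary that keeps the most recently viewed rows at the end. The LRUCache class and its method names are illustrative; a production cache would more likely be a dedicated caching service, which the notes do not specify.

```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` rows and discards the least recently viewed first."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._rows = OrderedDict()

    def get(self, key):
        if key not in self._rows:
            return None                      # cache miss: caller falls back to the database
        self._rows.move_to_end(key)          # mark as most recently viewed
        return self._rows[key]

    def put(self, key, row) -> None:
        self._rows[key] = row
        self._rows.move_to_end(key)
        if len(self._rows) > self.capacity:
            self._rows.popitem(last=False)   # evict the least recently viewed row
```
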
Pareto Principle: 80-20 rule
- 20% of the videos generate 80% of the traffic
- Cache that 20%
CDN - Geographically distributed cache servers
- Move popular videos to CDNs
  - CDNs replicate content in multiple places - videos will stream from a friendlier network
  - CDN machines make heavy use of caching and can mostly serve videos out of memory

Fault Tolerance

Use Consistent Hashing for distribution among DB servers
- Helps replace dead servers
- Helps distribute load among servers

Other Details

YouTube was originally built using a relational database (MySQL)
- The work to scale it resulted in Vitess
