Video Streaming Platform

Notes from grokking system design

Services: Youtube, Netflix

Requirements

  • Functional
    • Upload, view and share videos
    • Search videos from titles/description
    • Record stats (likes, number of views)
    • Comment on videos
  • Non-functional
    • High Availability
      • Consistency can take a hit in the interest of availability/lower latency, fine if an uploaded video takes a while to be available
    • High Reliability (uploaded videos are never lost)
    • Minimal Latency (while watching videos)

Capacity Estimation / Constraints

  • Read-heavy system Read (View) to Write (Upload) Ratio 200:1
  • Traffic
    • Total Users: 1.5 B
    • Daily Active Users (DAU): 800 M
    • Assume: A user watches 5 videos / day
      • (Read) 800 M DAU * 5 videos / day * 1 day / 86400 s = 46 K videos / s
      • (Write) 46 K videos / s * 1/200 = 230 videos / s
  • Storage
    • Assume
      • 500 hr worth of videos are uploaded / min
      • On average, 1 min video needs 50 MB storage (videos need to be stored in multiple formats)
        • 500 hr of videos uploaded / min * 60 min / hr * 50 MB = 1500 GB / min = 25 GB / s
          • Ignored video compression and replication
  • Bandwidth - 10 MB / min
    • (Write / Ingoing) 500 hr of videos uploaded / min * 60 min / hr * 10 MB / min = 300 GB / min = 5 GB / s
    • (Read / Outgoing) 200 * 5 GB / s = 1 TB / s

System APIs

  • upload_video(api_dev_key, video_contents, video_title, video_description, tags[], category_id, default_language, recording_details)
    • Parameters:
      • api_dev_key (string): limit users
      • video_contents (stream): video to be uploaded
      • video_title (string)
      • video_description (string)
      • tags (string[])
      • category_id (string)
      • default_language (string)
      • recording_details (string): Location info
    • Returns:
      • result (JSON):
        • status: successful upload returns HTTP 202 (request accepted), if successful, can send email notification as well with link
        • link: access to the video
  • search_video(api_dev_key, search_query, user_location, maximum_videos_to_return, page_token)
    • Parameters:
      • api_dev_key (string)
      • search_query (string): search term
      • user_location (string)
      • maximum_videos_to_return (number)
      • page_token (string): specify a page in the result set that should be returned
    • Returns:
      • result (JSON): contains information about the list of video resources matching the search query
        • videos (array): each has video title, a thumbnail, a video creation date, and a view count
  • stream_video(api_dev_key, video_id, offset, codec, resolution)
    • Parameters:
      • api_dev_key (string)
      • video_id (string)
      • offset (number): time offset in seconds from the beginning of video
      • codec (string) & resolution (string): to support play/pause from multiple devices
    • Returns: stream
      • Media stream (video chunk) from given offset

Database Schema

  • Video Metadata - SQL
    • Video
      • VideoID (PK), Title, Description, Size, Thumbnail, Uploader/UserID, TotalNumberOfLikes, TotalNumberOfViews
    • Comment
      • CommentID (PK), VideoID (FK), UserID (FK), Comment, CreationDate
  • User Metadata - SQL
    • UserID (PK), Name, Email, Address, Age, RegistrationDetails

High Level Design

  1. Processing Queue - videos to be dequeued for encoding, thumbnail generation and storage
  2. Encoder - encode uploaded videos into multiple formats
  3. Thumbnails Generator
  4. Video and Thumbnail Storage - distributed file storage / object storage
  5. User Datastore
  6. Video Metadata Datastore

Detailed Component Design

  • Read-heavy system - more views than uploads
  • Video stored using a Distributed File Storage System (HDFS) or Distributed Object Storage (S3)
    • Maintain multiple copies of videos
  • Microservices Architecture - separate the services, scale them independently
    • Read Service
      • LB - distribute traffic to different servers since we have multiple copies of videos
    • Write Service
      • Single Leader Replication
        • Can cause some staleness in data, but still acceptable (takes some milliseconds to update)
      • Video Uploads - Support resuming from the same point if connection fails, store uploaded videos on server
      • Video Encoding - New task added to processing queue to encode video into multiple formats
  • Thumbnails
    • Read-heavy - Higher thumbnail views than video views
      • Watching one video, there can be multiple other thumbnails of other videos
      • Each thumbnail is small in size (5-10 KB)
    • Store in Bigtable
      • Combines multiple files into one block to store on the disk
      • Efficient in reading a small amount of data
    • Cache hot thumbnails to improve latencies

Metadata Sharding

  • Based on UserID - store all data for a specific user on one server
    • Pass UserID to hash function which will map the user to a database server
    • Query videos for a particular user will be straightforward, pass UserID to the hash function
    • To search videos by titles, query all servers and each server will return a set of videos, a centralized server will then aggregate and rank these results before returning them to the user
    • Issues
      1. Performance bottleneck if a user becomes popular (a lot of queries on the server holding the user)
      2. Nonuniform distribution of data as some users may be storing a lot more that others
    • Solution - use Consistent Hashing to balance the load between servers or repartition/redistributed the data
  • Based on VideoID - hash function map each VideoID to a random server where we will store that video’s metadata
    • To find videos of a user, query all servers and each server will return a set of videos, a centralized server will then aggregate and rank these results before returning them to the user
      • This approach solves our problem of popular users but shifts it to popular videos
    • Improve performance by caching hot videos

Video Deduplication

  • Perform depuplication early in the process
  • Run matching algorithms (Block Matching, Phase Correlation, etc.) to find video duplications
  • Replace existing video if
    • If new video higher quality
    • If new video is longer, or existing video is subpart of new video
      • Intelligently divide the video into smaller chunks, only upload the new parts

Load Balancing

  • Use Consistent Hashing among cache servers

Cache & Content Delivery Network

  • Cache hot database rows for metadata servers
    • Eviction Policy: LRU, discard least recently viewed row first
  • Pareto Principle: 80-20 rule
    • 20% is generating 80% of traffic
    • Cache the 20%
  • CDN - Geographically distributed cache servers
    • Move popular videos to CDNs
      • CDNs replicate content in multiple places - videos will stream from a friendlier network
      • CDN machines make heavy use of caching and can mostly serve videos out of memory

Fault Tolerance

  • Use Consistent Hashing for distribution among DB servers
    • Help replace dead servers
    • Help distribute load among servers

Other Details

  • Youtube was originally built using a relational database (MySQL)
    • Discovered new ways to scale it, resulted in Vitess