The Interview Where I Had to Build Instagram From Scratch

I had a system design interview where the prompt was basically:
“Design Instagram from scratch. Handle feeds, media uploads, caching, high-traffic accounts, and ensure that when someone unfollows another user, their photos immediately become non-viewable.”
Here’s exactly how I approached it as a developer and engineering manager — clean, technical, and to the point.
1. Requirements I Clarified
Functional
- Users can create accounts, follow/unfollow, post photos, view feeds.
- Photos and profiles must reflect privacy controls.
- When user A unfollows user B, A instantly loses access to B’s posts.
Non-functional
- Heavy read volume.
- Very heavy write bursts (from celebrities).
- Low-latency feed reads (<300 ms).
- Strong access control for private posts.
- High availability, horizontal scale.
2. High-Level Architecture
                    +--------------------+
         Client --> |    API Gateway     |
                    +---------+----------+
                              |
     +-----------+------------+------------+-----------+
     |           |            |            |           |
 +---v---+  +----v----+  +----v----+   +---v---+   +---v---+
 | Auth  |  | Profile |  | Follow  |   | Feed  |   | Media |
 | Svc   |  | Svc     |  | Graph   |   | Svc   |   | Svc   |
 +---+---+  +----+----+  +----+----+   +---+---+   +---+---+
     |           |            |            |           |
     |      +----v----+  +----v----+  +----v----+      |
     |      | User DB |  | Follow  |  | Feed DB |      |
     |      +---------+  | EdgesDB |  +---------+      |
     |                   +---------+                   |
     |                                                 |
     |                 +-------------------------------v------+
     |                 |    Object Storage + CDN (Photos)     |
     |                 +--------------------------------------+
3. Caching Strategy (Core to Instagram-Scale Systems)
Instagram lives on caching. I made this explicit in every layer.
Cache Layers I Defined
- Feed Cache (critical)
- Key: user_feed:{user_id}
- TTL ~ 30–60 seconds
- Stored in Redis / Memcached
- Invalidated when:
- new post arrives
- user follows/unfollows someone
- privacy settings change
- User Profile Cache
- username, follower count, bio
- Heavy read, low write
- Cached aggressively with TTL 3–10 minutes
- Follow Graph Cache
- following:{user_id} = list of users they follow
- followers:{user_id} = list of users following them
- Stored in memory (Redis) for fast fan-out and auth checks
- On follow/unfollow:
- update DB
- update cache
- push invalidation event
- Post Metadata Cache
- Post captions, timestamps, media references
- Small objects, ideal for Redis
- TTL ~ 24h (safe to keep for performance)
- Media Cache (CDN)
- Real images cached at the CDN edge
- Origin fetch is rare
- Signed URLs protect access
Caching is not optional — Instagram cannot function without it.
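A minimal Python sketch of how the TTL-based layers above could be keyed and expired. The in-process `TTLCache` class is a hypothetical stand-in for Redis or Memcached, and the TTL values are simply picked from the upper end of the ranges listed; none of these names come from a real system.

```python
import time

# Hypothetical TTLs in seconds, taken from the ranges listed above.
CACHE_TTL = {
    "user_feed": 60,         # feed cache: 30-60 s
    "profile": 600,          # profile cache: 3-10 min
    "post_meta": 24 * 3600,  # post metadata: ~24 h
}

class TTLCache:
    """Tiny in-process stand-in for Redis/Memcached (illustration only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (expires_at, value)

    def set(self, layer, ident, value):
        key = f"{layer}:{ident}"
        self._data[key] = (self._clock() + CACHE_TTL[layer], value)

    def get(self, layer, ident):
        key = f"{layer}:{ident}"
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily evict expired entries
            return None
        return value

    def delete(self, layer, ident):
        # Explicit invalidation, e.g. on follow/unfollow or a privacy change.
        self._data.pop(f"{layer}:{ident}", None)
```

The injectable `clock` is only there so expiry can be exercised without sleeping.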
4. Posting a Photo (Flow + Cache Interaction)
Client -> API Gateway -> Media Service -> Object Storage/CDN
               |
               v
         Post Service -> DB -> Feed Service
                                    |
                                    v
                    Feed Write + Cache Invalidations
Steps
- Media Service gives the client a pre-signed upload URL.
- Client uploads image directly to object storage (S3-like).
- Client sends metadata (POST /posts) to Post Service.
- Post stored in DB → event emitted.
- Fan-out worker updates followers’ feeds.
- Invalidate feed cache for affected users:
- Delete user_feed:{id} for each follower.
This ensures follower feeds show the new post on their next load instead of waiting for the cache TTL to expire.
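The fan-out and invalidation steps can be sketched roughly as follows. The dictionaries are hypothetical in-memory stand-ins for the follow graph, the feeds table, and the `user_feed:{id}` cache keys, and `on_post_created` is an assumed name for the worker's event handler.

```python
from collections import defaultdict

followers = defaultdict(set)  # followee_id -> follower ids (follow graph)
feeds = defaultdict(list)     # user_id -> post ids, newest first (feeds table)
feed_cache = {}               # user_id -> cached feed (stand-in for user_feed:{id})

def on_post_created(author_id, post_id):
    """Fan-out worker: push the post to each follower and drop their cached feed."""
    for follower_id in followers[author_id]:
        feeds[follower_id].insert(0, post_id)
        feed_cache.pop(follower_id, None)  # equivalent of DEL user_feed:{follower_id}
```

Deleting the cached feed rather than patching it in place keeps the worker simple: the next read rebuilds the feed from the feeds table.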
5. Feed Generation — With Caching
Home Feed Read
Flow:
Client -> API Gateway -> Feed Service -> (Cache first)
Algorithm:
- Check feed cache:
- If hit → return cached feed. (fast)
- If miss → rebuild feed:
- Read from feeds table
- Merge with “celebrity accounts” real-time posts
- Store result:
SET user_feed:{id} <feed> EX 60
- Return feed.
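The read path above is a cache-aside lookup. Here is a small sketch under a few assumptions: posts are `(timestamp, post_id)` tuples, and `load_feed_rows` and `load_celebrity_posts` are hypothetical callables standing in for the feeds-table query and the celebrity-post query.

```python
import time

FEED_TTL_SECONDS = 60

feed_cache = {}  # user_id -> (expires_at, feed); stand-in for user_feed:{id}

def read_feed(user_id, load_feed_rows, load_celebrity_posts, now=time.time):
    """Cache-aside read. Posts are (timestamp, post_id) tuples."""
    entry = feed_cache.get(user_id)
    if entry and entry[0] > now():
        return entry[1]  # cache hit
    # Cache miss: rebuild from the feeds table plus celebrity posts.
    merged = sorted(
        set(load_feed_rows(user_id)) | set(load_celebrity_posts(user_id)),
        reverse=True,  # newest first
    )
    feed_cache[user_id] = (now() + FEED_TTL_SECONDS, merged)
    return merged
```

Within the TTL, repeated calls for the same user return the cached list without touching either loader.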
6. Fan-Out Strategy (Normal vs High Traffic Accounts)
Normal users
Use fan-out on write:
New Post -> push post_id into each follower’s feed list
Fast reads, moderate writes.
High-traffic / Celebrity accounts
Use fan-out on read:
- Don’t push posts to millions of followers.
- Store posts only in posts table.
- On feed read:
- Merge cached feed with celebrity posts.
Diagram:
Normal: Fan-out on write -> Feed table
Celebrity: Fan-out on read -> Query on read + merge
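One way to picture the hybrid strategy is a follower-count cutoff that decides which path a post takes. The threshold value and the names `fan_out_strategy` and `publish` are assumptions for illustration; the interview answer did not fix a specific number.

```python
CELEBRITY_FOLLOWER_THRESHOLD = 100_000  # hypothetical cutoff

def fan_out_strategy(follower_count):
    """Return "read" for high-traffic accounts, "write" for everyone else."""
    return "read" if follower_count >= CELEBRITY_FOLLOWER_THRESHOLD else "write"

def publish(author_id, post_id, followers, posts, feeds):
    """Store the post; push it to follower feeds only for non-celebrity authors."""
    posts.setdefault(author_id, []).append(post_id)
    author_followers = followers.get(author_id, set())
    if fan_out_strategy(len(author_followers)) == "write":
        for follower_id in author_followers:
            feeds.setdefault(follower_id, []).append(post_id)
    # For "read" accounts nothing is pushed; the feed read merges from posts.
```

A celebrity post therefore costs one write to the posts table, regardless of follower count.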
7. Unfollow Logic + Security: Instantly Blocking Content
This was one of the key interview points:
What happens when A unfollows B?
The system must make B’s images non-viewable to A.
The steps I defined:
7.1 Follow Graph Update
When A unfollows B:
DELETE FROM follows WHERE follower_id=A AND followee_id=B
Then:
- Delete from follow graph cache:
- following:A
- followers:B
Emit unfollow event:
unfollow(A, B)
7.2 Feed Cleanup
Background worker removes B’s posts from A’s feed:
DELETE FROM feeds WHERE user_id=A AND post_owner=B
Then:
DEL user_feed:{A} // Clear cached feed
Next time A loads their feed → cache is rebuilt → B is gone.
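Steps 7.1 and 7.2 can be sketched together with SQLite standing in for the follow and feed stores. This is a simplification: the feed cleanup runs inline in one transaction here rather than in a background worker, and the `cache` dict is a hypothetical stand-in for Redis.

```python
import sqlite3

def unfollow(db, cache, follower_id, followee_id):
    """Remove the follow edge, purge the followee's posts from the feed, drop caches."""
    with db:  # one transaction for both deletes; commits on success
        db.execute(
            "DELETE FROM follows WHERE follower_id = ? AND followee_id = ?",
            (follower_id, followee_id),
        )
        db.execute(
            "DELETE FROM feeds WHERE user_id = ? AND post_owner = ?",
            (follower_id, followee_id),
        )
    # Invalidate only after the database commit succeeds.
    for key in (
        f"following:{follower_id}",
        f"followers:{followee_id}",
        f"user_feed:{follower_id}",
    ):
        cache.pop(key, None)
```

Invalidating after the commit avoids a window where a concurrent read could repopulate the cache from rows that are about to be deleted.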
7.3 Media Authorization — The Final Gate
Even if A somehow still has old URLs to B’s photos:
- URLs are signed with short TTL.
- Client requests /media/{post_id}.
- Media Service checks:
if user_is_allowed(A, post_owner=B) == false:
    deny access
Why critical?
Because images are cached globally at the CDN edge.
Authorization must occur on every media request, not just feed load.
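No follow = no signed URL granted = no image visible.
A URL issued before the unfollow stays valid only until its TTL expires, which is why the TTL has to be short.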
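A rough sketch of the signing gate, assuming an HMAC-SHA256 signature over the post, viewer, and expiry. `SECRET_KEY`, `URL_TTL_SECONDS`, the query parameter names, and both function names are hypothetical; real CDNs ship their own signed-URL formats.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"demo-secret"  # hypothetical; load from a secret store in practice
URL_TTL_SECONDS = 60         # hypothetical short TTL

def _signature(post_id, viewer_id, expires):
    message = f"{post_id}:{viewer_id}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()

def sign_media_url(post_id, viewer_id, owner_id, following, now=None):
    """Return a signed URL, or None if the viewer does not follow the owner."""
    if viewer_id != owner_id and owner_id not in following.get(viewer_id, set()):
        return None  # deny access: no follow edge, no signed URL
    expires = int(now if now is not None else time.time()) + URL_TTL_SECONDS
    sig = _signature(post_id, viewer_id, expires)
    return f"/media/{post_id}?viewer={viewer_id}&expires={expires}&sig={sig}"

def verify_media_url(post_id, viewer_id, expires, sig, now=None):
    """Check expiry and signature (constant-time comparison)."""
    current = now if now is not None else time.time()
    if current > expires:
        return False
    return hmac.compare_digest(sig, _signature(post_id, viewer_id, expires))
```

With this shape, the worst case after an unfollow is that an already-issued URL keeps working for at most `URL_TTL_SECONDS`.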
8. Protecting Private Photos
If B has a private account:
- Only followers can request signed URLs.
- Feed Service enforces read-time checks.
- Media Service enforces access control before signing URLs.
- Follow Graph cache ensures checks are fast.
If A unfollows B → B’s posts become inaccessible to A again.
9. High-Traffic Scenarios & Caching Stabilizers
To handle celebrities or viral moments, I emphasized:
Anti-Thundering-Herd Techniques
- Staggered cache expiry (jitter).
- Lock-based cache rebuilds:
- Prevent multiple servers rebuilding the same feed at once.
- Write-through caching for profiles.
- CDN for all images/video.
- Sharded DBs for:
- Posts
- Feeds
- Follows
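The first two techniques in this list, jittered expiry and lock-based rebuilds, can be sketched as below. The names and the 20% jitter are assumptions, and the lock is per-process only; across multiple servers the same idea would need a distributed lock (for example a Redis `SET key value NX PX <ms>` guard).

```python
import random
import threading

BASE_TTL_SECONDS = 60
JITTER_FRACTION = 0.2  # hypothetical: spread expiry by +/-20%

def jittered_ttl(base=BASE_TTL_SECONDS, rng=random):
    """Randomize the TTL so keys written together do not expire together."""
    return base * (1 + rng.uniform(-JITTER_FRACTION, JITTER_FRACTION))

_cache = {}
_locks = {}
_locks_guard = threading.Lock()

def get_or_rebuild(key, rebuild):
    """Let only one thread rebuild a missing key; the others wait and reuse it."""
    if key in _cache:
        return _cache[key]
    with _locks_guard:
        lock = _locks.setdefault(key, threading.Lock())
    with lock:
        if key not in _cache:  # re-check after acquiring the lock
            _cache[key] = rebuild()
        return _cache[key]
```

The second `key not in _cache` check is what stops the waiting threads from repeating the rebuild once the first one finishes.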
Hot User Protection
If Beyoncé posts:
- Post Service writes metadata.
- Celebrity posts don’t fan out.
- Feed Service merges them on read.
- Cache per user remains stable.
This prevents millions of writes per post.
10. Full System Diagram with Caching
              +--------------------+
              |    API Gateway     |
              +---------+----------+
                        |
              [Auth + Rate Limits]
                        |
       +----------------+----------------+
       |                |                |
+------v------+  +------v------+  +------v------+
| Profile Svc |  | Follow Svc  |  |  Feed Svc   |
+------+------+  +------+------+  +------+------+
       |                |                |
+------v------+  +------v------+  +------v------+
| Profile DB  |  |  Follow DB  |  |   Feed DB   |
+------+------+  +------+------+  +------+------+
       |                |                |
       |          [Redis Cache]          |
       |                                 |
       +----------------+----------------+
                        |
            +-----------v----------+
            |    Media Service     |
            |    (Signed URLs)     |
            +-----------+----------+
                        |
                +-------v-------+
                |   CDN Layer   |
                +-------+-------+
                        |
                +-------v-------+
                | Object Storage|
                +---------------+
Caching sits at:
- Feed Service
- Follow Service
- Profile Service
- CDN
- Media Service access layer
11. Summary of My Interview Approach
What I communicated, step-by-step:
- Decompose the system into services.
- Define data models that scale.
- Describe hybrid fan-out strategy.
- Introduce caching everywhere:
- Feed cache
- Profile cache
- Follow graph cache
- Post cache
- CDN media cache
- Explain authorization logic:
- Follow graph is source of truth
- Signed URLs
- No “zombie” access
- Describe unfollow security:
- Remove feed items
- Invalidate cache
- Deny media access
- Handle high-traffic accounts separately.
- Cover security, rate limiting, and resilience strategies.