Scaling Pinterest - Next Stage

Marty Weiner, Smurf Village
Yash Nelapati, Asgard
Thursday, April 25, 13

Pinterest is... an online pinboard to organize and share what inspires you.

Growth: March 2010
[Chart: Page Views / Day, Mar 2010 to May 2012]
· Rackspace
· 1 small Web Engine
· 1 small MySQL DB
· 1 Engineer (3 Total)

Growth: September 2011
[Chart: Page Views / Day, Mar 2010 to May 2012]
· Amazon EC2 + S3 + CloudFront
· 2 Nginx, 16 Web Engines + 2 API Engines
· 5 Functionally Sharded MySQL DBs + 9 read slaves
· 4 Cassandra Nodes
· 15 Membase Nodes (3 separate clusters)
· 8 Memcache Nodes
· 10 Redis Nodes
· 3 Task Routers + 4 Task Processors
· 4 Elasticsearch Nodes
· 3 Mongo Clusters
· 3 Engineers (8 Total)

It will fail. Keep it simple.

Growth: April 2012
[Chart: Page Views / Day, Mar 2010 to May 2012]
· Amazon EC2 + S3 + EdgeCast
· 135 Web Engines + 75 API Engines
· 10 Service Instances
· 80 MySQL DBs (m1.xlarge) + 1 slave each
· 110 Redis Instances
· 60 Memcache Instances
· 2 Redis Task Managers + 60 Task Processors
· Sharded Solr
· 15 Engineers (25 Total)

April 2013 Architecture

Growth: April 2013
[Chart: Page Views / Day, April 2012 to April 2013]
· Amazon EC2 + S3 + EdgeCast
· 300 Web Engines + 400 API Engines
· 69 MySQL DBs (hi1.4xlarge on SSDs) + 1 slave each
· 100+ Redis Instances
· 230+ Memcache Instances
· 7 Redis Task Managers + 500 Task Processors
· 70+ Engineers (130+ Total)

Growth: April 2013 (continued)
· 6 services (80 instances)
· Sharded Solr
· 20 HBase
· 12 Kafka + Azkaban
· 8 Zookeeper Instances
· 12 Varnish

Data Flow

April 2012 Pinployees
• 12 Engineers
  • 1 Data Infrastructure
  • 1 Ops
  • 2 Mobile
  • 8 Generalists
April 2013
• 65 Engineers
  • 7 Data Infrastructure + Science
  • 7 Search and Discovery
  • 9 Business and Platform
  • 6 Spam, Abuse, Security
  • 9 Web
  • 9 Mobile
  • 2 Growth
  • 10 Infrastructure
  • 6 Ops

Technologies

• Amazon
• Python, Java, Go
• MySQL
• Memcache
• Redis
• HBase

If you’re the biggest user of a technology, the challenges will be greatly amplified.

Why Amazon? [Hosting]
• When? Beginning
• Very good peripherals, such as load balancing, DNS, map reduce, and more...
• New instances ready in seconds
When to move to a datacenter?
• Once you’re consistently hitting issues beyond your control

Why Python? [Code]
• Extremely mature
• Well known and well liked
• Solid active community
• Very good libraries specifically targeted to web development
• Effective rapid prototyping

Why Not Python? [Code]
• Interpreted
• Global Interpreter Lock
• Primitive GC
• Alternatives: Java, Go

Why MySQL and Memcache? [Production Data]
• Extremely mature
• Well known and well liked
• Rarely catastrophic loss of data
• Response time increases linearly with request rate
• Very good software support - XtraBackup, Innotop, Maatkit
• Solid active community
• Free

Why Redis? [Production Data]
• Well known and well liked
• Consistently good performance
• Free
• Variety of convenient and efficient data structures
• Insert into queue in O(1)
• 3 Flavors of Persistence: Now, Snapshot, Never
  • For HIGH write:read ratio, snapshot saves a lot of I/O bandwidth
  • Snapshot increases reliability in noisy environments

Why HBase? (or, Why NOT MySQL, Redis, Memcache) [Production Data]
• Efficient storage
• Can handle large write throughput
• Solid Hadoop interface
• Maturing quickly, used heavily by Facebook
• Built on HDFS
• Free
• When? Use it to optimize your already mature system

Challenges

• Employee Growth
• Data Data Data
• Abuse Protection
• Uptime and Latency
• Connections

Challenge: One Codebase + Lots of Engineers = Deploy Hell [Employee Growth]
• Major bugs and performance issues stall deploys
• Performance issues creep in under the radar
• 7+ development teams, 1 ops team
• Workload changing more rapidly and less predictably
• Want developers to not fear moving fast
Challenge: Maintain Fast Flexible Experimentation
• Want to empower engineers and PMs to try new things

Solution: Deploy Checkpoints [Employee Growth]
• Aggressive unit tests (careful! don’t erase your DB!)
• Rings of deployment
  • Canary, employees only, 5% of user base, etc.
• Continuous deployment
• Production integration tests
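
A minimal sketch of how the per-user rings above could be decided, assuming users are bucketed by a stable hash of their id. The ring names, the `bucket` helper, and the choice of MD5 are illustrative assumptions, not something the slides specify; canary hosts are assumed to be selected at the host level and are left out.

```python
import hashlib

# Hypothetical ring names; only "employees only" and "5% of user base"
# come from the slide.
RINGS = ("employees", "five_percent", "everyone")


def bucket(user_id: int) -> int:
    """Map a user id to a stable bucket in [0, 100)."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % 100


def sees_new_build(user_id: int, is_employee: bool, ring: str) -> bool:
    """Return True if this user is served the new build at the given ring."""
    if ring not in RINGS:
        raise ValueError(f"unknown ring: {ring}")
    if ring == "employees":
        return is_employee
    if ring == "five_percent":
        # Employees stay included; everyone else joins by hash bucket.
        return is_employee or bucket(user_id) < 5
    return True  # "everyone"
```

Because the bucket depends only on the user id, a user who lands in the 5% ring stays there from request to request, which keeps the experience consistent while a deploy is widening.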

Solution: Services [Employee Growth]
• Move away from a monolithic code base and topology
• When? 50 engineers or too many connections
• Empower each team
• Service architecture with metrics and alerts
• Configurable deployment
• Ability to add capacity
• Convenient and consistent data storage and caching
• Provide Reliable Business and Ops Data
• Win: Protect your database from accidents (e.g., unit test dropping DB tables)

Challenge: Provide Reliable Business and Ops Data [Data Data Data]
• Business relies more heavily on data
• Need reliable metrics to run successful experiments

Solution: Use What's Available [Data Data Data]
• Google Analytics
  • When? Day 1
• S3 + Amazon’s EMR
  • When? When you start needing to dig deeper
  • You’ll need a data logging pipeline

Solution: Build a Reliable Data Pipeline [Data Data Data]
• Example: Flume, Scribe, Kafka
• Get data from business logic to map reduce
• Benefits:
  • Track trends
  • Understand what’s actually going on
  • Recover from database mishaps
• What do you log? All Requests? All Events? Individual types of Events?
• Answer: All of the above

Challenge: Spam and Abuse [Abuse Protection]
• Abusive content
• Hijacking
• Application Security
• DDoS / Scraping
• Spam
• Each flavor has a different set of actors with unique motives and behavior

Solution: Spam Detection and Prevention [Abuse Protection]
• Spammers...
  • are human
  • know your product and demographics as well as you do
  • know your defenses very well
  • are generally more tech savvy than your users
  • will grow with you
  • want to make money
• If spammers are not making a good ROI, they’ll go away
• Always communicate blocks as if the receiver is a good user

Challenge: Increase Availability, Decrease Latency [Uptime and Latency]
• Push for better uptime and lower latency
• Initially, most uptime and latency issues due to DB + caching
• Fewer Instances => Few, but big failures
• More Instances => More smaller failures + more complexity
• How aggressively can you retry without hurting the system?
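
The slide leaves the retry question open. One common way to bound retry pressure, shown here purely as an illustration and not as the approach the speakers describe, is capped exponential backoff with jitter. The function name and the default values are assumptions.

```python
import random


def backoff_delays(attempts=5, base=0.05, cap=1.0, rng=random.random):
    """Yield capped exponential delays (seconds) with full jitter.

    `rng` returns a float in [0, 1); it is injectable so the schedule
    can be made deterministic in tests.
    """
    for attempt in range(attempts):
        # The ceiling doubles each attempt until it reaches the cap.
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter spreads retries out so clients do not retry in lockstep.
        yield rng() * ceiling
```

A caller would sleep for each yielded delay between attempts and give up once the generator is exhausted, so a failing backend sees a bounded, decaying retry rate rather than an immediate flood.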

Solution: Metrics Dashboard and Alerts [Uptime and Latency]
• Create dashboard + alerts, and review response times weekly
• When? Soon after launch at latest
• Profile everything
  • MySQL - Maatkit, InnoTop
  • Memcache - Maatkit
  • Frontend - New Relic
  • General Ops - StatsD, Nagios / Monit, Ganglia

Solution: Configuration Manager and Failover [Uptime and Latency]
• Provides load balancing and automatic connection reconfiguration
• When? 30+ caches / DBs
• One option: Intermediate load balancers
  • Example: HAProxy, Nginx, Varnish
  • Extra latency hop
  • More complication
  • Configuration hassle (1 LB / 7 services?)

Solution: Zookeeper Co-ordination
• Centralized configuration management
• Used for service discovery
• Notifies of service failures
• WATCH and its callback are pretty reliable
• Experiment framework

[Diagram: Services Register with Zookeeper; the app sets a WATCH on Zookeeper]
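
The Register / WATCH interaction can be sketched with an in-memory stand-in. The `Registry` class below is hypothetical and is not the ZooKeeper client API; in particular, real ZooKeeper watches are one-time triggers that the client must re-arm, whereas this sketch keeps callbacks registered for brevity.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set


class Registry:
    """In-memory stand-in for a coordination service (illustration only)."""

    def __init__(self) -> None:
        self._instances: Dict[str, Set[str]] = defaultdict(set)
        self._watchers: Dict[str, List[Callable[[List[str]], None]]] = defaultdict(list)

    def register(self, service: str, address: str) -> None:
        """A service instance announces itself."""
        self._instances[service].add(address)
        self._notify(service)

    def unregister(self, service: str, address: str) -> None:
        """Models an instance failing or shutting down."""
        self._instances[service].discard(address)
        self._notify(service)

    def watch(self, service: str, callback: Callable[[List[str]], None]) -> None:
        """The app asks to be told whenever the instance list changes."""
        self._watchers[service].append(callback)
        callback(sorted(self._instances[service]))  # deliver the current state

    def _notify(self, service: str) -> None:
        current = sorted(self._instances[service])
        for callback in self._watchers[service]:
            callback(current)
```

An app would pass a callback that rebuilds its connection list, so a failed instance drops out of rotation without a config push or an intermediate load balancer.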

Part 1: Configuration Manager and Failover
MySQL Failover
[Diagram: databases A and B, App, Zookeeper; three successive states]
1. Zookeeper {"master" : "A"}, readonly=True
2. Zookeeper {"master" : "B"}, readonly=True
3. Zookeeper {"master" : "B"}, readonly=False
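
The three states in the failover diagram can be written out as a sequence of config snapshots. This sketch assumes the `readonly` flag lives in the same config record as `master`, which the slides do not state; `failover_steps` is a hypothetical helper.

```python
def failover_steps(config: dict, new_master: str):
    """Yield successive config snapshots for a master switch."""
    # 1. Stop writes first so the old master and its replica cannot diverge.
    step1 = dict(config, readonly=True)
    yield step1
    # 2. Point clients at the new master while writes are still blocked.
    step2 = dict(step1, master=new_master)
    yield step2
    # 3. Re-enable writes, now against the new master.
    yield dict(step2, readonly=False)
```

Each snapshot would be published to the coordination service in order, and clients watching the config would reconnect after step 2 and resume writes after step 3.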

Memcache Failures
[Diagram: App -> Nutcracker -> Cache 001, Cache 002, Cache 003, Cache 004, Cache 005]
• Ketama ring adjusted
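
To illustrate why adjusting a Ketama ring is cheap, here is a simplified consistent-hash ring. It follows the Ketama idea (many hashed points per node on a ring) but is not byte-compatible with libketama or Nutcracker; the class name, point count, and hash slicing are assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Simplified consistent-hash ring (Ketama-style, illustration only)."""

    def __init__(self, nodes, points_per_node=100):
        self._points_per_node = points_per_node
        self._ring = []  # sorted list of (hash, node) points
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest()[:8], 16)

    def add(self, node: str) -> None:
        for i in range(self._points_per_node):
            bisect.insort(self._ring, (self._hash(f"{node}-{i}"), node))

    def remove(self, node: str) -> None:
        """Drop a failed node; only its points leave the ring."""
        self._ring = [point for point in self._ring if point[1] != node]

    def lookup(self, key: str) -> str:
        """Return the node owning the first point at or after the key's hash."""
        idx = bisect.bisect(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]  # wrap around the ring
```

When one cache is removed, only the keys that lived on it are remapped; keys on the other four caches keep their owner, so the failure costs roughly a fifth of the cache rather than all of it.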

Solution: Instance Configuration [Uptime and Latency]
• Example: Puppet
• Systems to auto- and re-configure your instances
• Makes it easier to spin up more capacity or replacements
• When to use?
  • Once you begin to have clear server instance segmentation
  • ~15+ instances
  • Earlier is better: you’ll want all your instances Puppet-ified

Challenge: Number of Connections Rising [Connections]
• Initially, entire app tier connected to all Memcache, Redis, MySQL
• On Memcache...
  • 20k connections * 10kB / connection = 195MB / Memcache
  • 40 Memcaches means 7.6 GB used on connections
  • Connection space is not allocated from slab memory!
  • Can eventually cause Memcache process to leak into swap
• On MySQL
  • At least 256 kB / connection
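
The Memcache figures above can be reproduced with a few lines of arithmetic, assuming the slide's kB, MB, and GB are binary units (1 kB = 1024 bytes), which is the reading that yields 195 MB and 7.6 GB.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def connection_overhead_bytes(connections: int, bytes_per_connection: int) -> int:
    """Memory consumed by connection buffers alone."""
    return connections * bytes_per_connection


per_memcache = connection_overhead_bytes(20_000, 10 * KB)
fleet = 40 * per_memcache

print(f"{per_memcache / MB:.0f} MB per Memcache")   # 195 MB
print(f"{fleet / GB:.1f} GB across 40 Memcaches")   # 7.6 GB
```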

Challenge: Number of Connections Rising [Connections]
• On Redis...
  • Max number of connections allowed is 10240 (weird...)
  • Exceeding max connections will make Redis CPU peg at 100%
  • On Ubuntu 12.04, default max connections is 1024 (!!)
  • (Go change to 65536 now)

Solution: Connection Pooling and Multiplexing [Connections]
• Data Services, Nutcracker
• When? Once any service gets close to 10k connections
• Success: Memcache
  • Once was >20k connections
  • Now 1.3k connections
• But, aggressive fan-out causes...
  • Network contention
  • Incast congestion
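
A minimal sketch of the pooling half of this idea: a bounded client-side pool that caps how many connections one process opens. This is not the proxy-level multiplexing that Nutcracker performs; `ConnectionPool` and its `factory` parameter are hypothetical names.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Bounded pool: at most `size` connections are ever created."""

    def __init__(self, factory, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=1.0):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back
```

With a pool of size N per process, the backend sees at most N connections from that process no matter how many requests it handles concurrently, which is how a fleet-wide count can fall by an order of magnitude.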

Finagle: Why Java over Python
• RPC for high concurrency
• Twitter
• Completely asynchronous
• Previous experience with Finagle
• Lots of compatible libraries
• JVM
• Lots of bells and whistles - Ostrich, Zipkin, Iago

Near Term Challenges [What’s Next?]
• Continually improve deployment mechanisms
• More aggressively push toward services maintained by teams
• Growing beyond 130 Pinployees
• Build products faster
• MySQL 5.6
• After that? I don’t know...

Questions?

Appendix

Redis Configuration [Tips]
• Challenges
  • BGSAVE forks the main Redis process, potentially doubling RAM usage
  • If one Redis instance is using > 70% of RAM, you’re in danger
  • Redis is single threaded
• Solution
  • Run 32 Redis instances on a single host (port 6379, 6380, ...)
  • Can move some of those instances to a new host to add capacity
  • If you configure each instance with multiple databases, you can split even more times
  • Utilizes more cores
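
A sketch of the many-instances-per-host layout, assuming keys are routed to an instance by id modulo the instance count. The host names, the modulo routing, and the helper names are illustrative assumptions; only the 32 instances and the ports starting at 6379 come from the slide.

```python
def build_placement(num_instances=32, host="redis-host-a", base_port=6379):
    """Map instance index -> (host, port), all on one host initially."""
    return {i: (host, base_port + i) for i in range(num_instances)}


def instance_for(key_id: int, placement) -> tuple:
    """Route a key to its instance; the instance count never changes."""
    return placement[key_id % len(placement)]


def move_instances(placement, instance_ids, new_host):
    """Return a new placement with some instances relocated to another host."""
    moved = dict(placement)
    for i in instance_ids:
        _, port = moved[i]
        moved[i] = (new_host, port)
    return moved
```

Because the number of instances stays fixed, adding capacity only changes where an instance runs, not which instance owns a key, so no data has to be rehashed.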

Redis Configuration [Tips]
• Also...
  • Max number of connections allowed is 10240
  • Can only be overcome by editing source and recompiling

Increasing connections in Ubuntu 12.04 [Tips]
• As root...
  • In /etc/sysctl.conf
      fs.file-max = 65536
  • In /etc/security/limits.conf
      * soft nofile 65536
      * hard nofile 65536
  • In ~/.bashrc
      ulimit -n 65536
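
After changing these settings, a process can check the limit it actually inherited. This sketch uses Python's standard `resource` module (Unix only); the `has_headroom` helper and its default threshold are assumptions chosen to match the 65536 value above.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def has_headroom(required: int = 65536) -> bool:
    """True if the soft limit allows at least `required` open files."""
    soft, _hard = nofile_limits()
    return soft == resource.RLIM_INFINITY or soft >= required
```

Running the check from the same environment that launches the server matters, because limits set in a login shell do not necessarily apply to processes started by an init system.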

MySQL config [Tips]
• Protect yourself from whereless UPDATEs!
  • Example: UPDATE users SET first_name = "Bob";
  • Now all your users are named Bob!
• Add sql_safe_updates=1 to my.cnf
