
QCON NYC 2014: Scaling Foursquare: from check-ins to recommendations.

Foursquare has grown from a simple check-in application to one that can tell you the best place to grab a burger. Along the way we've also become the best source of local data for apps like Instagram and Pinterest. I'll talk about the general architecture, storage systems, and development practices that we've created to handle the ever-increasing volume and complexity. Foursquare started out as a PHP/MySQL app running on a single virtual server. Early on we switched to Scala and Mongo, and have recently been aggressively splitting a monolithic application into multiple services while maintaining a single code base. I'll talk about the somewhat unique approach we've taken in our service-oriented architecture. I'll also go over how we run Mongo and the additional storage systems we're running for more specialized tasks like search, key-value lookups, and offline tasks.

Jon Hoffman

June 11, 2014


Transcript

1. This is a talk about how we scaled Foursquare over the years. And by scale I mean designing our systems to handle more incoming requests, storing more data, and managing complexity. I'll give you a bit of background on where we started and how we got to our current design. This story's been told many times, by many different people, and will continue to be told several more times in this very room. So I'll try to highlight some of the more unique approaches we've taken.
2. Foursquare is a recommendation engine for finding the best places to go for things like food and drinks. We also have a newly released location app called Swarm for keeping up and meeting up with your friends. [Subtitle of talk.] Based in NYC. Around 50 million users.
3. I'll break this up into two parts. In part one I'll talk about scaling the data storage layer, and in part two I'll talk about scaling our codebase and tooling.
4. Development of Foursquare started in 2009. Initially we were using MySQL, but we quickly switched to Postgres because it happened to have better support in the ORM library we were using.
5. We overcame initial scaling challenges fairly easily (see how confident those goats look) with bigger boxes for our Postgres server and memcache in front of it to fully cache some types of objects.
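A read-through cache like that is simple in outline. Here is a minimal sketch of the pattern; the Cache and Database interfaces and the User type are hypothetical stand-ins, not Foursquare's actual code.

```scala
// Minimal sketch of a read-through cache: check memcache first, fall back
// to Postgres on a miss, then populate the cache for subsequent reads.
// (Interfaces below are illustrative stand-ins.)
case class User(id: Long, name: String)

trait Cache {
  def get(key: String): Option[User]
  def set(key: String, value: User, ttlSeconds: Int): Unit
}

trait Database {
  def findUser(id: Long): Option[User]
}

class CachedUserStore(cache: Cache, db: Database) {
  def get(id: Long): Option[User] = {
    val key = s"user:$id"
    cache.get(key).orElse {
      val fromDb = db.findUser(id)                 // cache miss: read Postgres
      fromDb.foreach(u => cache.set(key, u, 3600)) // warm the cache for next time
      fromDb
    }
  }
}
```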
6. But then things started to get a bit more challenging as we got our first million users in early 2010.
7. Soon we started to split some of the very large and fast-growing tables onto their own dedicated servers, and we switched from doing SQL joins to joining things in our application code.
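An application-level join just means issuing separate queries against the now-separate servers and stitching the results together in memory. A hedged sketch (the store interfaces here are invented for illustration):

```scala
// Sketch of joining in application code instead of SQL: fetch checkins,
// then batch-fetch the referenced venues from a different server.
case class Checkin(id: Long, userId: Long, venueId: Long)
case class Venue(id: Long, name: String)

trait CheckinStore { def byUser(userId: Long): Seq[Checkin] }
trait VenueStore   { def byIds(ids: Seq[Long]): Seq[Venue] }

def checkinsWithVenues(userId: Long,
                       checkins: CheckinStore,
                       venues: VenueStore): Seq[(Checkin, Venue)] = {
  val cs = checkins.byUser(userId)                  // query the checkin server
  val vs = venues.byIds(cs.map(_.venueId).distinct) // one batched query, not N+1
  val byId = vs.map(v => v.id -> v).toMap
  cs.flatMap(c => byId.get(c.venueId).map(v => (c, v))) // stitch together in memory
}
```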
8. We also sent some query traffic to read-only replicas that we already had set up for failover.
9. In early 2010, our checkin rate was growing exponentially. It was clear that in a few months we'd get to a point where that single table would get too large for the available memory, or where the write rate would be too high for the available IOPS. So we knew we'd have to split that data up across multiple machines.
10. Splitting up a single type of data across multiple servers is often referred to as sharding. None of us had any prior experience doing that, but we knew that we could either build something ourselves on top of Postgres or use something that purported to handle it for us. We knew that one big challenge with sharding is rebalancing data as you add new nodes, or as data sizes become unbalanced due to the way things are split, so our inclination was not to try to solve that problem ourselves. We were already interested in Mongo because of the geo index features and schema flexibility, and they also had sharding and autobalancing on their roadmap, though those features were not yet implemented. We decided to take a gamble that Mongo would have stable sharding features by the time we really needed them.
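For intuition, the heart of sharding is a deterministic mapping from a record's shard key to a node; the rebalancing problem comes from what happens when the node set changes. A toy sketch (not Mongo's actual chunk-and-balancer scheme):

```scala
// Toy shard router: map a shard key to one of N nodes by hashing.
// Real systems avoid plain modulo, since adding a node would remap
// nearly every key; that is the rebalancing problem mentioned above.
def shardFor(shardKey: Long, nodes: Vector[String]): String =
  nodes(java.lang.Math.floorMod(shardKey, nodes.size.toLong).toInt)

val nodes = Vector("shard-a", "shard-b", "shard-c")
println(shardFor(42L, nodes)) // the same key always routes to the same shard
```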
11. So it took around a year and a half, but eventually we got all of our data into Mongo, and it's still our system of record for all user data. That's a screenshot of our real-time monitoring dashboard. We're currently running over 15 clusters with over 600 replicas, which handle over one million queries per second.
12. Mongo is great for general-purpose storage, but there are a few cases where it's not the best fit for us: we have key-value data that we generate nightly and want to be able to swap in as an entirely new dataset; there are some specialized queries that are difficult to make performant within Mongo; and there is some highly queried data where we're very sensitive to latency variability.
13. A lot of the data we use to service online recommendation queries is calculated offline. That data might include venue ratings based on aggregated signals, score data from machine learning models, etc. So we run MapReduce jobs to generate special files that contain efficient indexes preceding the actual data. Those HFiles are written out to the distributed file system, and the data may be split up so that we can fit the entire dataset into multiple machines' memory. The HFile server processes read the data from HDFS, and then the application servers use Thrift RPC to send key lookups to the HFile servers. ZooKeeper's used to coordinate everything. I'll mention ZooKeeper a few more times; it's become the indispensable glue that holds everything together.
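From the application server's point of view, the whole system is a remote key-value lookup. A rough sketch of that client side, where HFileService stands in for a Thrift-generated stub and the ZooKeeper-based discovery is elided:

```scala
// Sketch of an app server doing a key lookup against an hfile server over
// RPC. `HFileService` stands in for a Thrift-generated client interface;
// the dataset name and value encoding are invented for illustration.
trait HFileService {
  def get(dataset: String, key: Array[Byte]): Option[Array[Byte]]
}

class VenueRatings(rpc: HFileService) {
  def rating(venueId: Long): Option[Double] = {
    val key = venueId.toString.getBytes("UTF-8")
    rpc.get("venue_ratings_nightly", key)                // remote lookup; the data
      .map(bytes => new String(bytes, "UTF-8").toDouble) // lives in server memory
  }
}
```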
14. We tail the Mongo operation log and send data updates to a Kafka queue. One or more consumers might read the data from Kafka and send it to a cache server. A cache server contains business logic around how to index that data into Redis, and it also exposes RPC methods for clients to send queries. And again, all the accounting for where things live is handled by ZooKeeper.
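In outline the pipeline is oplog tail → Kafka → consumer → Redis. A sketch of the producer half, with the oplog reader and Kafka client hidden behind hypothetical interfaces:

```scala
// Sketch of the oplog-tailing producer: read each Mongo oplog entry and
// publish it to a per-collection Kafka topic. `OplogTail` and `Producer`
// are hypothetical stand-ins for the real clients.
case class OplogEntry(ts: Long, collection: String, docId: String, doc: String)

trait OplogTail { def next(): OplogEntry }            // blocking read of the oplog
trait Producer  { def send(topic: String, key: String, value: String): Unit }

def run(tail: OplogTail, kafka: Producer): Unit = {
  while (true) {
    val entry = tail.next()
    // Keying by document id keeps a given document's updates in order;
    // downstream consumers apply them to the Redis-backed cache servers.
    kafka.send(s"mongo.${entry.collection}", entry.docId, entry.doc)
  }
}
```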
15. Part 2: code. The year is 2009, and Foursquare would launch at SXSW in March of that year. Dennis and Naveen, the co-founders, split up dev responsibilities, with Dennis on the server side and Naveen on the iPhone app. Dennis wrote the server-side code in PHP and did a pretty impressive job given his limited prior experience, but the code was an unmaintainable mess of inline SQL, layout, and logic. No models, no views, no controllers. Later that year, Foursquare would hire their first server engineer, Harry Heymann, who's now the head of engineering. Harry came from a Java development background and was really intrigued with Scala, so he rewrote Foursquare using a Scala web framework called Lift. That effort was really about code maintainability, not scaling.
16. We built a trace tool to account for all the RPC and database calls executed by API endpoints. I think this is one of the most important things we built early on; it was a huge help in investigating performance problems. You'll notice that you can also see the call site for the Mongo query.
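One way to build that kind of accounting is to wrap every RPC and database call so it records itself, along with its call site, against the current request. A hypothetical sketch, not the actual tool:

```scala
// Sketch of per-request call tracing: each wrapped call records its name,
// duration, and (approximate) call site into a request-scoped trace.
case class TracedCall(name: String, micros: Long, callSite: String)

class RequestTrace {
  private val calls = scala.collection.mutable.ArrayBuffer.empty[TracedCall]

  def traced[T](name: String)(body: => T): T = {
    val site  = Thread.currentThread.getStackTrace()(2).toString // caller frame
    val start = System.nanoTime()
    try body
    finally calls += TracedCall(name, (System.nanoTime() - start) / 1000, site)
  }

  def report(): Seq[TracedCall] = calls.toSeq
}

// Usage: wrap each mongo/RPC call so the endpoint's trace lists them all.
// val user = trace.traced("mongo.users.findOne") { users.findOne(id) }
```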
17. We would also graph the number of RPC calls executed by each endpoint so we would know when we introduced unintended regressions. This was one of the first graphs we looked at when we noticed a performance regression after a code push.
18. We also built a system that allows us to dynamically switch code on or off; we call that throttles. We can switch a feature on for specific user ids, or for a consistent hashed percentage of users. We use it for developing new features and doing gradual rollouts, among other things. This is another indispensable tool for us.
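The bucketing mechanic is easy to sketch: hash the user id together with the throttle's name so each feature gets an independent, stable slice of users. The following is a hypothetical illustration, not the actual implementation:

```scala
// Sketch of a throttle check: a user is "in" a feature if they're on an
// explicit allowlist or if their stable hash bucket falls under the
// rollout percentage. Hashing (name, userId) together gives each
// throttle an independent bucketing of the user base.
case class Throttle(name: String, allowedIds: Set[Long], percent: Int)

def isEnabled(t: Throttle, userId: Long): Boolean =
  t.allowedIds.contains(userId) || {
    val bucket = java.lang.Math.floorMod((t.name, userId).hashCode.toLong, 100L)
    bucket < t.percent            // the same user always lands in the same bucket
  }

val newFeed = Throttle("new-feed", allowedIds = Set(1L, 2L), percent = 10)
isEnabled(newFeed, userId = 12345L) // stable across requests and deploys
```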
19. For all that time we continued to run all of Foursquare within the same binary. We had a giant monolithic codebase.
20. Scala is very slow to compile, and our builds took many minutes. There was very high code churn, so there was a high probability that something broke or regressed, in which case the entire process would have to be rolled back.
21. Our first effort to break up the monolith was to break apart the dependencies between distinct chunks of functionality.
22. But we were worried that the ease of service creation would lead to a Cambrian explosion of services.
23. Recently, as we started to build Swarm, we had smaller engineering teams which own feature development front to back, from the iPhone/Android applications through to the server backends. Those teams wanted to run quick experiments with new API endpoints which were exposed via the public API, so that they could access them on the iPhone clients.
24. This is basically all it takes to implement a new endpoint. An engineer can run that from their development machine in EC2, and it will get traffic sent to those paths at api.foursquare.com. By default the authorization layer only allows employee access to the paths.
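The slide's code isn't reproduced in this transcript, but an endpoint registration of that shape might look roughly like the following. This is a hedged reconstruction; the trait and all names are invented for illustration:

```scala
// Hypothetical sketch of a self-registering endpoint: a service declares
// the paths it handles, and the routing layer (backed by ZooKeeper)
// advertises them at api.foursquare.com. All names here are invented.
case class HttpRequest(path: String, params: Map[String, String])
case class HttpResponse(status: Int, body: String)

trait RemoteEndpoint {
  def paths: Seq[String]                        // advertised in the routing table
  def handle(request: HttpRequest): HttpResponse
}

object BurgerExperiment extends RemoteEndpoint {
  val paths = Seq("/v2/venues/burgerscore")
  def handle(req: HttpRequest) =
    HttpResponse(200, s"""{"venueId": "${req.params.getOrElse("venueId", "")}"}""")
}
// Running this from a dev machine registers the paths; the authorization
// layer defaults to employee-only access while experimenting.
```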
25. The apirouter is a standalone process that takes the incoming HTTP request and looks up the requested path in the routing table stored in ZooKeeper. If it finds a service that can handle the request, it will do authentication and authorization and forward the request to one of the servers in the pool that advertised handling for that path. The communication from the router to the remote endpoint is Thrift RPC.
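Putting that together, the router's request path is: look up the path in the routing table, authenticate and authorize, then pick a backend from the pool and forward. A schematic sketch with the ZooKeeper-backed table and the Thrift hop abstracted away (all names hypothetical):

```scala
// Schematic of the apirouter's forwarding logic. RoutingTable stands in
// for the ZooKeeper-backed path -> server-pool mapping; `forward` stands
// in for the Thrift RPC hop to the chosen backend.
import scala.util.Random

case class RouterRequest(path: String, userId: Long)
case class RouterResponse(status: Int, body: String)

trait RoutingTable { def poolFor(path: String): Option[Seq[String]] }
trait Auth         { def permitted(req: RouterRequest): Boolean }

class ApiRouter(table: RoutingTable, auth: Auth,
                forward: (String, RouterRequest) => RouterResponse) {
  def route(req: RouterRequest): RouterResponse =
    table.poolFor(req.path) match {
      case None       => RouterResponse(404, "no service advertises this path")
      case Some(pool) =>
        if (!auth.permitted(req)) RouterResponse(403, "forbidden")
        else forward(pool(Random.nextInt(pool.size)), req) // naive pool selection
    }
}
```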