Slide 1

Slide 1 text

Adam Hitchcock @NorthIsUp How Disqus Does SOA on Django

Slide 3

Slide 3 text

If this is interesting to you... psst, we’re hiring: disqus.com/jobs

Slide 4

Slide 4 text

TOC ๏ What is a Disqus? ๏ Why did you lie to us last year, Adam? ๏ What is SOA, and why should you SOA? ๏ Different data patterns in SOA ๏ How Disqus does SOA ๏ Legacy example ๏ New service example ๏ Is SOA at Disqus a success?

Slide 5

Slide 5 text

Last year my talk was “How Disqus does ‘it’ when ‘it’ isn’t Django”

Slide 7

Slide 7 text

Why do I sit on a throne of lies? ๏ “Double down on Django” - my CTO ๏ leverage Django community ๏ standard practices make hiring easier ๏ we are already really good at Django stuff ๏ Challenge assumptions and find ways to use Django outside of the normal web/request pattern

Slide 8

Slide 8 text

Adam Hitchcock @NorthIsUp Take 2: How Disqus Does ‘it’ Django Edition

Slide 9

Slide 9 text

Who already knows what SOA stands for?

Slide 10

Slide 10 text

SOA stands for: Service-Oriented Architecture

Slide 11

Slide 11 text

What is a SOA? ๏ Architecting systems to contain… ๏ discrete software applications (services) ๏ simple, well-defined interfaces (APIs) ๏ loose cooperation to perform a required function ๏ Two software roles in SOA ๏ service provider ๏ service consumer ๏ an app may play both roles

Slide 12

Slide 12 text

SOA is not a new idea “Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.” - Doug McIlroy

Slide 13

Slide 13 text

Services you already use ๏ Databases ๏ Postgres, MySQL, redis, etc. ๏ Queues ๏ Rabbit, Kafka (I know it’s not a queue…) ๏ External APIs ๏ Twitter, Facebook, etc. ๏ Single page javascript apps

Slide 14

Slide 14 text

Why SOA? ๏ Allows for heterogeneous environment ๏ Data location transparency ๏ Small stable APIs ๏ Independent scalability ๏ Easier testability ๏ Easier deployment ๏ Easier to maintain conceptual integrity

Slide 15

Slide 15 text

Data patterns in SOA

Slide 16

Slide 16 text

Data patterns of services ๏ Transactional Data ๏ REST ๏ Model access ๏ RPC (procedural) ๏ High logic endpoints (recommendations) ๏ Auth Systems ๏ Async Data ๏ Queues ๏ Pub/Sub

Slide 17

Slide 17 text

Stable APIs ๏ Pick your interface definition language (IDL) ๏ JSON, Protobuf, Thrift, etc. ๏ Pick your transport protocol ๏ HTTP, Thrift, etc. ๏ I like HTTP + JSON, Django is pretty good at it ๏ “Accept” header or “format” param ๏ “Connection: Keep-Alive”
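As a rough illustration of the “Accept” header or “format” param bullet, here is a minimal, framework-agnostic sketch of picking a response format. The function names (`pick_format`, `render`) and the set of supported formats are assumptions made for this example; they are not taken from the talk or from Django, and quality values in the header are deliberately ignored.

```python
import json

# Formats this hypothetical service can emit.
SUPPORTED = {"json": "application/json", "text": "text/plain"}


def pick_format(accept_header="", format_param=None):
    """Return a format name, preferring an explicit ?format= value."""
    if format_param in SUPPORTED:
        return format_param
    # Walk the Accept header in order; quality values (q=) are ignored here.
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip().lower()
        for name, mime in SUPPORTED.items():
            if media_type == mime:
                return name
    return "json"  # default when nothing matches


def render(payload, fmt):
    """Serialize a dict and return (content_type, body)."""
    if fmt == "text":
        lines = [f"{key}: {value}" for key, value in sorted(payload.items())]
        return SUPPORTED["text"], "\n".join(lines)
    return SUPPORTED["json"], json.dumps(payload, sort_keys=True)
```

In a Django view, the two inputs would typically come from the request headers and the query string; keeping the choice in one small function makes it easy to apply the same rule across every endpoint.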

Slide 18

Slide 18 text

REST + Django ๏ This is where Django already excels ๏ Django REST framework ๏ Or roll your own thin API

Slide 19

Slide 19 text

RPC + Django ๏ Useful for logic-heavy APIs ๏ recommendation ๏ authentication and authorization ๏ Prone to overspecialized APIs ๏ RPC systems can hide network costs too much ๏ Thrift ๏ zerorpc
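To make the procedural style concrete, here is a small sketch of an RPC-like dispatcher over JSON. It does not use Thrift or zerorpc; the registry, the `rpc` decorator, the `recommend` method, and the request shape are all hypothetical and exist only for illustration.

```python
import json

# Registry of callable methods, keyed by name (hypothetical design).
RPC_METHODS = {}


def rpc(func):
    """Register a function as an RPC method under its own name."""
    RPC_METHODS[func.__name__] = func
    return func


@rpc
def recommend(user_id, limit=3):
    # Placeholder logic; a real service would score candidates here.
    return [f"thread-{user_id}-{n}" for n in range(limit)]


def handle(raw_request):
    """Decode a JSON request, call the named method, return a JSON reply."""
    request = json.loads(raw_request)
    method = RPC_METHODS.get(request.get("method"))
    if method is None:
        return json.dumps({"error": "unknown method"})
    try:
        result = method(**request.get("params", {}))
    except TypeError as exc:  # most often wrong or missing arguments
        return json.dumps({"error": str(exc)})
    return json.dumps({"result": result})
```

Each registered function becomes part of the public contract, which is one way the “overspecialized APIs” risk shows up: every new helper is tempting to expose.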

Slide 20

Slide 20 text

Async + Django ๏ High-CPU or long-running tasks ๏ Django Management Commands ๏ while True: do_work() ๏ Celery ๏ post_save hook + celery task ๏ Celery messages are easy to parse in any language ๏ go-celery ๏ Celery Beat for periodic tasks
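The “while True: do_work()” pattern can be sketched in plain Python as below. In a Django project this loop would usually sit inside the `handle()` method of a custom management command; here it is shown without Django so it runs on its own. The `run_worker` name, the in-memory queue, and the idle-poll exit are assumptions for the example, and a production worker would normally keep polling rather than exit.

```python
import queue


def do_work(job):
    """Stand-in for the real CPU-heavy or long-running task."""
    return job * job


def run_worker(jobs, results, max_idle_polls=1):
    """Process jobs until the queue stays empty for max_idle_polls polls."""
    idle = 0
    while True:
        try:
            job = jobs.get(timeout=0.01)
        except queue.Empty:
            idle += 1
            if idle >= max_idle_polls:
                break  # exit so the example terminates
            continue
        idle = 0
        results.append(do_work(job))
    return results
```

A short usage example: fill a `queue.Queue` with a few integers, call `run_worker(jobs, [])`, and the returned list holds one result per job in the order they were queued.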

Slide 21

Slide 21 text

Django is easy to run ๏ Django IO Loop ๏ easy to run ๏ easy to understand ๏ Multiple entry points into Django ๏ WSGI ๏ management command ๏ Celery task ๏ Celery beat

Slide 22

Slide 22 text

SOA at Disqus

Slide 23

Slide 23 text

Disqus Web, a legacy project

Slide 24

Slide 24 text

Disqus Web ๏ Monolithic Django project ๏ 183,108 lines of code ๏ Over 7 years old ๏ Lots of bad decisions

Slide 25

Slide 25 text

Deployment ๏ Deploy the entire code base (as a lib) ๏ Cluster machines by purpose ๏ cpu/memory/network patterns emerge ๏ makes scale planning easier ๏ Requests routed to services based on hostname + path ๏ Three phase deployment
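Routing on hostname + path can be pictured with the sketch below. The talk does not say how Disqus implements this (it is commonly done in a load balancer rather than in Python), so the route table, hostnames, cluster names, and the longest-prefix rule are all assumptions chosen for illustration.

```python
# (hostname, path prefix, cluster): all values are hypothetical.
ROUTES = [
    ("api.example.com", "/internal/", "internal-api"),
    ("api.example.com", "/", "public-api"),
    ("www.example.com", "/", "web"),
]


def pick_cluster(host, path, routes=ROUTES):
    """Return the cluster with the longest matching path prefix for a host."""
    matches = [
        (len(prefix), cluster)
        for route_host, prefix, cluster in routes
        if route_host == host and path.startswith(prefix)
    ]
    if not matches:
        return None
    # The longest prefix wins, so "/internal/" beats the catch-all "/".
    return max(matches)[1]
```

Because every cluster runs the same deployed code base, a table like this is the only thing that decides which machines serve which traffic.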

Slide 26

Slide 26 text

Entry points ๏ Different entry points to change purpose ๏ DJANGO_SETTINGS_MODULE ๏ multiple settings.py ๏ multiple urls.py ๏ Using different settings.py files we can… ๏ load different middleware ๏ load different url resolvers ๏ url resolution is expensive ๏ different template request contexts
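One way to picture the multiple-settings approach is a tiny launcher that points `DJANGO_SETTINGS_MODULE` at a different settings module per service. `DJANGO_SETTINGS_MODULE` is the real environment variable Django reads; the service names and module paths below are hypothetical and are not Disqus's actual layout.

```python
import os

# Hypothetical mapping of service name to settings module.
SETTINGS_BY_SERVICE = {
    "public_api": "project.settings.public_api",
    "internal_objects": "project.settings.internal_objects",
}


def configure(service, environ=None):
    """Select the settings module for a service before Django starts."""
    environ = os.environ if environ is None else environ
    module = SETTINGS_BY_SERVICE[service]  # KeyError for unknown services
    environ["DJANGO_SETTINGS_MODULE"] = module
    return module
```

Each settings module can then name its own middleware list and root URL configuration, so the same code base boots as a heavy public API on one cluster and as a one-route internal service on another.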

Slide 27

Slide 27 text

Example Services ๏ Public api ๏ a ton of middleware ๏ hundreds of url routes ๏ lots of automatic request context ๏ Internal objects api ๏ no middleware ๏ one url route ๏ no request context for transformers

Slide 28

Slide 28 text

Did it work?

Slide 29

Slide 29 text

It works fine

Slide 30

Slide 30 text

Did it work? ๏ Problems typical of a large code base ๏ Version conflicts still problematic ๏ internal function api changes ๏ external package upgrades ๏ Conceptual integrity still hard ๏ you can only remember so many lines of code ๏ Constantly integrating with entire code base

Slide 31

Slide 31 text

Did it work? ✓ Allows for heterogeneous environment ✓ Data location transparency ✗ Small stable APIs ✓ Independent scalability ✗ Easier testability ✓ Easier deployment ✗ Easier to maintain conceptual integrity

Slide 32

Slide 32 text

Disqus Ads Server, a new project

Slide 33

Slide 33 text

The Disqus Ads server ๏ Use Django apps for encapsulation ๏ Leverage Django beyond WSGI ๏ Multiple code bases ๏ only one codebase can access the DB directly ๏ others access via REST or RPC APIs
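The rule that only one codebase touches the database, while the others go through an API, might look like this from the consumer side. The class name, URL layout, and response fields are invented for the example; the transport is injected so the sketch runs without a network, and in practice `fetch` could wrap an HTTP client.

```python
import json


class AdsDataClient:
    """Hypothetical consumer of a data service's HTTP + JSON API."""

    def __init__(self, base_url, fetch):
        self.base_url = base_url.rstrip("/")
        self.fetch = fetch  # callable taking a URL, returning a body string

    def get_ad(self, ad_id):
        """Fetch one ad over the API instead of querying the ORM."""
        body = self.fetch(f"{self.base_url}/ads/{ad_id}/")
        return json.loads(body)


def fake_fetch(url):
    # Test double standing in for a real HTTP call.
    ad_id = int(url.rstrip("/").split("/")[-1])
    return json.dumps({"id": ad_id, "url": url})
```

For instance, `AdsDataClient("http://ads-data.internal/", fake_fetch).get_ad(42)` returns a dict built from the JSON body, and the calling service never imports a model.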

Slide 34

Slide 34 text

Lots o’ services ๏ Ads Data API ๏ Django REST framework ๏ minimal RPC endpoints ๏ Ads Serving API ๏ RPC endpoint ๏ Ads Scoring & Ads Cache/Time-Series Warming ๏ Management command ๏ Ads Data Import ๏ Celery + Celery Beat

Slide 35

Slide 35 text

Code organization ๏ Ads Data service ๏ Ads Data Import service ๏ 11,400 lines ๏ Ads Serving service ๏ Ads Scoring service ๏ Ads Cache Warming service ๏ Ads Time-Series Warming service ๏ 11,185 lines

Slide 36

Slide 36 text

What does it look like? [Architecture diagram. Components: The Internet; Ads Data API; Ads Serving API; Cache; Legacy Disqus Web Monolithic; Ads Scoring Service; Ads Data Import Service; Internal Ads Tooling; Advertiser dashboard; Disqus Embed; Gutter Feature Switch Service; Ads Cache Warming Service; Ads Time-Series Warming Service. Legend: Has ORM access / No ORM access.]

Slide 37

Slide 37 text

What does it look like? [Architecture diagram, labeled by technology: The Internet; Django uwsgi REST; Django uwsgi RPC; Redis; Django + a million custom things; Django Celery Beat; javascript backbone; javascript backbone; Disqus Embed; Django uwsgi; Django manage.py command; Django manage.py command; Django Celery Beat. Legend: Has ORM access / No ORM access.]

Slide 38

Slide 38 text

Did it work?

Slide 39

Slide 39 text

uhmuzing

Slide 40

Slide 40 text

Did it work? ๏ Harder to share code between services ๏ need to use external packages ๏ Django best practices help a lot long term ๏ Easy to understand the entire system ๏ easy to quickly add + test code ๏ integration tests are more important ๏ service APIs live longer, need more support ๏ Fast deploys and tests ๏ Ease of scalability

Slide 41

Slide 41 text

Did it work? ✓ Allows for heterogeneous environment ✓ Location transparency ✓ Small stable APIs ✓ Independent scalability ✓ Easier testability ✓ Easier deployment ✓ Easier to maintain conceptual integrity

Slide 42

Slide 42 text

Is SOA a success for Disqus?

Slide 44

Slide 44 text

Is SOA a success for Disqus? ๏ Easier to run overall ๏ Easier to understand new systems ๏ Easier to not break existing systems

Slide 45

Slide 45 text

Roundup ๏ “Do one thing and do it well” ๏ Know what data pattern you are solving for ๏ Stick to your API decisions ๏ protocol ๏ transport ๏ Django has multiple entry points, use them

Slide 46

Slide 46 text

Links ๏ Support Django REST Framework on Kickstarter ๏ kickstarter.com/projects/tomchristie/django-rest-framework-3 ๏ django-rest-framework.org ๏ github.com/mattrobenolt/go-celery ๏ lincolnloop.com/django-best-practices/ ๏ en.wikipedia.org/wiki/Unix_philosophy

Slide 47

Slide 47 text

If this was interesting to you... psst, we’re hiring: disqus.com/jobs

Slide 49

Slide 49 text

Questions!

Slide 50

Slide 50 text

Questions for me “What was the most challenging part of designing the system?” - Tom Christie of Django REST Framework “I thought you would say ‘designing it so it doesn't go horribly horribly wrong when one part breaks’” - Also Tom Christie of Django REST Framework

Slide 51

Slide 51 text

Questions for you ๏ How do you make maintainable RPC endpoints? ๏ What are the best methods of service discovery? ๏ Which is better: one codebase or many? ๏ What is a microservice?
