Slide 1

Slide 1 text

Roll-your-own API Management Platform with NGINX and Lua
Sean Cribbs, Comcast Cable @seancribbs

Slide 2

Slide 2 text

About me

Slide 3

Slide 3 text

Background

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

Consumer

Slide 6

Slide 6 text

Internal Consumer

Slide 7

Slide 7 text

Partner Internal Consumer

Slide 8

Slide 8 text

API Management

Slide 9

Slide 9 text

API Management access control capacity management

Slide 10

Slide 10 text

CodeBig 1 API Consumer

Slide 11

Slide 11 text

CodeBig 1 API Consumer CDN • traffic shaping • caching

Slide 12

Slide 12 text

CodeBig 1 API Consumer CDN • traffic shaping • caching • access control • rate limiting Vendor APIM

Slide 13

Slide 13 text

CodeBig 1 API Consumer CDN • traffic shaping • caching • access control • rate limiting Vendor APIM Internet Comcast LB • DNS RR • VIP

Slide 14

Slide 14 text

CodeBig 1 API Consumer CDN • traffic shaping • caching • access control • rate limiting Vendor APIM • DMZ intermediary • path-host mapping iAuth Internet Comcast LB • DNS RR • VIP

Slide 15

Slide 15 text

CodeBig 1 API Consumer CDN • traffic shaping • caching • access control • rate limiting Vendor APIM • DMZ intermediary • path-host mapping iAuth Internet Comcast Origin APIs LB • DNS RR • VIP

Slide 16

Slide 16 text

Challenges

Slide 17

Slide 17 text

Challenges visibility

Slide 18

Slide 18 text

Challenges visibility responsibility

Slide 19

Slide 19 text

Challenges visibility responsibility scope

Slide 20

Slide 20 text

Challenges visibility responsibility scope latency

Slide 21

Slide 21 text

Challenges visibility responsibility scope latency security

Slide 22

Slide 22 text

CodeBig 2

Slide 23

Slide 23 text

CodeBig 2 simplify architecture

Slide 24

Slide 24 text

CodeBig 2 simplify architecture increase visibility

Slide 25

Slide 25 text

CodeBig 2 simplify architecture increase visibility use open-source tools

Slide 26

Slide 26 text

Architecture

Slide 27

Slide 27 text

custom  logic HTTP Proxy

Slide 28

Slide 28 text

Lua

Slide 29

Slide 29 text

-- Validates the OAuth signature
-- @return Const.HTTP_UNAUTHORIZED if either the key or signature is invalid
-- this method is internal and should not be called directly
function _M.validate_signature(self)
    local headers = self.req.get_oauth_params()
    local key = headers[Const.OAUTH_CONSUMER_KEY]
    local keyconf = self.conf.keys[key]
    if keyconf == nil then
        return {
            code = Const.HTTP_UNAUTHORIZED,
            error = Const.ERROR_INVALID_CONSUMER_KEY
        }
    end
    local sig = get_hmac_signature(self.req, keyconf.secret)
    if sig ~= headers[Const.OAUTH_SIGNATURE] then
        return {
            code = Const.HTTP_UNAUTHORIZED,
            error = Const.ERROR_INVALID_SIGNATURE
        }
    end
end
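The validation flow on this slide can also be sketched outside NGINX. The Python below is a minimal sketch and not the CodeBig code: it assumes the HMAC-SHA1 signature method of OAuth 1.0 with an empty token secret, and the names `KEYS`, `sign` and `validate_signature` are hypothetical. Comparing signatures with `hmac.compare_digest` (a constant-time comparison) is a choice made for this sketch; the slide itself uses a plain inequality test.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote

# Hypothetical key store: consumer key -> config (stands in for conf.keys).
KEYS = {"mykey": {"secret": "mysecret"}}


def _enc(value):
    # OAuth 1.0 percent-encoding: only unreserved characters stay literal.
    return quote(str(value), safe="-._~")


def sign(method, url, params, secret):
    # Base string: METHOD & encoded URL & encoded, sorted parameter list.
    pairs = sorted((_enc(k), _enc(v)) for k, v in params.items()
                   if k != "oauth_signature")
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    base = "&".join([method.upper(), _enc(url), _enc(normalized)])
    key = _enc(secret) + "&"  # assumption: no token secret
    digest = hmac.new(key.encode(), base.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def validate_signature(method, url, params):
    # Returns None when the request is acceptable, else an error table.
    keyconf = KEYS.get(params.get("oauth_consumer_key"))
    if keyconf is None:
        return {"code": 401, "error": "invalid consumer key"}
    expected = sign(method, url, params, keyconf["secret"])
    if not hmac.compare_digest(expected, params.get("oauth_signature", "")):
        return {"code": 401, "error": "invalid signature"}
    return None
```

As in the Lua version, a missing return value means the signature checked out; any other result carries the HTTP status and an error label.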

Slide 30

Slide 30 text

Lua < 3K LoC!!

Slide 31

Slide 31 text

Intra-Datacenter VIP

Slide 32

Slide 32 text

Intra-Datacenter VIP haproxy haproxy …

Slide 33

Slide 33 text

Intra-Datacenter VIP … haproxy haproxy …

Slide 34

Slide 34 text

Intra-Datacenter VIP Origin APIs … haproxy haproxy …

Slide 35

Slide 35 text

DC3 VIP DC2 VIP DC1 VIP Cross-Datacenter

Slide 36

Slide 36 text

DC3 VIP DC2 VIP DC1 VIP Cross-Datacenter
dc1-vip.       A       10.1.0.1
dc2-vip.       A       10.2.0.1
dc3-vip.       A       10.3.0.1

Slide 37

Slide 37 text

DC3 VIP DC2 VIP DC1 VIP Cross-Datacenter vod vod acct acct
dc1-vip.       A       10.1.0.1
dc2-vip.       A       10.2.0.1
dc3-vip.       A       10.3.0.1

Slide 38

Slide 38 text

DC3 VIP DC2 VIP DC1 VIP Cross-Datacenter vod vod acct acct
dc1-vip.       A       10.1.0.1
dc2-vip.       A       10.2.0.1
dc3-vip.       A       10.3.0.1
vod-dc1.       CNAME   dc1-vip.
vod-dc2.       CNAME   dc2-vip.

Slide 39

Slide 39 text

DC3 VIP DC2 VIP DC1 VIP Cross-Datacenter vod vod acct acct
dc1-vip.       A       10.1.0.1
dc2-vip.       A       10.2.0.1
dc3-vip.       A       10.3.0.1
vod-dc1.       CNAME   dc1-vip.
vod-dc2.       CNAME   dc2-vip.
vod-dc1-fo.    CNAME   dc1-vip.
vod-dc1-fo.    CNAME   dc2-vip.

Slide 40

Slide 40 text

DC3 VIP DC2 VIP DC1 VIP Cross-Datacenter vod vod acct acct
dc1-vip.       A       10.1.0.1
dc2-vip.       A       10.2.0.1
dc3-vip.       A       10.3.0.1
vod-dc1.       CNAME   dc1-vip.
vod-dc2.       CNAME   dc2-vip.
vod-dc1-fo.    CNAME   dc1-vip.
vod-dc1-fo.    CNAME   dc2-vip.
vod.           CNAME   vod-dc1.
vod.           CNAME   vod-dc2.
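To see which VIPs each name on this slide can end up at, the records can be walked as a small graph. This is only an illustrative sketch: the `RECORDS` table is copied from the slide, while the `resolve` helper is hypothetical and simply follows every listed target. It does not model how a real resolver or load balancer picks among several records for one name.

```python
# Records as listed on the slide: name -> list of (type, value).
RECORDS = {
    "dc1-vip.": [("A", "10.1.0.1")],
    "dc2-vip.": [("A", "10.2.0.1")],
    "dc3-vip.": [("A", "10.3.0.1")],
    "vod-dc1.": [("CNAME", "dc1-vip.")],
    "vod-dc2.": [("CNAME", "dc2-vip.")],
    "vod-dc1-fo.": [("CNAME", "dc1-vip."), ("CNAME", "dc2-vip.")],
    "vod.": [("CNAME", "vod-dc1."), ("CNAME", "vod-dc2.")],
}


def resolve(name, seen=None):
    """Return the sorted list of addresses reachable from `name`."""
    seen = set() if seen is None else seen
    if name in seen:  # guard against CNAME loops
        return []
    seen.add(name)
    addresses = set()
    for rtype, value in RECORDS.get(name, []):
        if rtype == "A":
            addresses.add(value)
        else:  # CNAME: keep following the chain
            addresses.update(resolve(value, seen))
    return sorted(addresses)
```

Calling `resolve("vod.")` walks both per-datacenter names, so the service name reaches the DC1 and DC2 VIPs but not DC3, matching the diagram where `vod` runs in only two datacenters.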

Slide 41

Slide 41 text

Capacity Management

Slide 42

Slide 42 text

N = XR

Slide 43

Slide 43 text

N = XR, where N = # concurrent requests

Slide 44

Slide 44 text

N = XR, where N = # concurrent requests, X = transaction rate

Slide 45

Slide 45 text

N = XR, where N = # concurrent requests, X = transaction rate, R = response time

Slide 46

Slide 46 text

Little’s Law: N = XR, where N = # concurrent requests, X = transaction rate, R = response time

Slide 47

Slide 47 text

client / origin / APIM; X = 2 req/s, R = 1 s
N = XR = 2 req/s × 1 s = 2 req

Slide 48

Slide 48 text

client / origin / APIM; X = 2 req/s, R = 10 s
N = XR = 2 req/s × 10 s = 20 req

Slide 49

Slide 49 text

client / origin / APIM; X = 2 req/s, R = 10 s
N = XR = 2 req/s × 10 s = 20 req
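The two worked examples above are the same multiplication with a different response time. A minimal sketch of that arithmetic follows; the function names are hypothetical, and `max_rate` is just the same law rearranged as X = N / R to show what a fixed concurrency cap implies for throughput.

```python
def concurrent_requests(rate_per_s, response_time_s):
    """Little's Law: N = X * R."""
    return rate_per_s * response_time_s


def max_rate(concurrency_limit, response_time_s):
    """Little's Law rearranged: X = N / R."""
    return concurrency_limit / response_time_s


# The two cases from the slides: the same 2 req/s client, a slower origin.
fast = concurrent_requests(2, 1)
slow = concurrent_requests(2, 10)
```

The point of the comparison is that the client's request rate did not change; only the origin's response time did, and the number of requests held open by the proxy grew tenfold.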

Slide 50

Slide 50 text

Concurrent Request Limiting
lua_shared_dict   counts 50M;
access_by_lua     …  +1
log_by_lua        …  -1
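The +1 / -1 pattern on this slide amounts to a shared in-flight counter per consumer. The Python below is a sketch of that bookkeeping only, not the OpenResty code: the class and method names are hypothetical, a lock-protected dict stands in for the shared dictionary, and it assumes `leave` is called only for requests that `enter` admitted.

```python
import threading


class ConcurrencyLimiter:
    """Caps the number of in-flight requests per key."""

    def __init__(self, limit):
        self.limit = limit
        self._counts = {}  # stands in for the shared dict
        self._lock = threading.Lock()

    def enter(self, key):
        # Access phase: admit and count the request, or refuse it.
        with self._lock:
            current = self._counts.get(key, 0)
            if current >= self.limit:
                return False
            self._counts[key] = current + 1
            return True

    def leave(self, key):
        # Log phase: the request is finished, release its slot.
        with self._lock:
            if self._counts.get(key, 0) > 0:
                self._counts[key] -= 1

    def in_flight(self, key):
        with self._lock:
            return self._counts.get(key, 0)
```

Because the cap is on N rather than on requests per second, a slow origin makes the limit bite sooner without any change in the client's request rate, which is the behavior Little's Law predicts.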

Slide 51

Slide 51 text

Testing

Slide 52

Slide 52 text

function TestOAuth1:test_reject_request_when_timestamp_expired()
    -- default timestamp is 01/01/2014 UTC so it's definitely stale
    local conf = Conf:new({validate_timestamp = true, ttl = 300})
    local header = Header:new()
    local req = Req:new({oauth_params = header})
    local oauth = OAuth1:new(conf, req)
    local res = oauth:authorize()
    assertEquals(res.code, Const.HTTP_UNAUTHORIZED)
    assertEquals(res.error, Const.ERROR_EXPIRED_TIMESTAMP)
end

lu = LuaUnit.new()
lu:setOutputType("tap")
os.exit(lu:runSuite())
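The unit test above exercises a timestamp freshness rule with a 300-second TTL. One plausible form of that rule is sketched below; the function name, the error strings and the symmetric window (rejecting timestamps too far in the future as well as the past) are assumptions of this sketch, not details taken from the slides.

```python
import time


def timestamp_error(oauth_timestamp, ttl=300, now=None):
    """Return None if the timestamp is fresh, else an error label."""
    now = time.time() if now is None else now  # injectable clock for tests
    try:
        ts = int(oauth_timestamp)
    except (TypeError, ValueError):
        return "invalid timestamp"
    if abs(now - ts) > ttl:
        return "expired timestamp"
    return None
```

Passing `now` explicitly plays the same role as the fixed default timestamp in the Lua test: the check becomes deterministic and needs no real clock.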

Slide 53

Slide 53 text

ngx.log(ngx.ERR, "oops")

Slide 54

Slide 54 text

function get_oauth_params_from_auth_header(env)
  env = env or ngx
  local auth_hdrs = env.req.get_headers()["Authorization"]
  ...

Slide 55

Slide 55 text

function TestRequest:test_retrieve_oauth_params_from_header()
  local header = [[Oauth realm="example.com", ]]
    .. [[oauth_consumer_key="mykey",]]
    .. [[oauth_version="1.0"]]
  local ngx = StubNgx:new({ Authorization = header })
  local res = get_oauth_params_from_auth_header(ngx)
  assertEquals("1.0", res.oauth_version)
  ...
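The trick in the last two slides is passing the environment in as an argument so a stub can replace the real `ngx` object under test. The same idea is sketched here in Python; `StubEnv` and the regular-expression parsing are hypothetical stand-ins, since the slides elide the body of the Lua function.

```python
import re
from urllib.parse import unquote

# key="value" pairs as they appear in an OAuth Authorization header.
_PARAM = re.compile(r'(\w+)="([^"]*)"')


class StubEnv:
    """Test double exposing only what the parser needs."""

    def __init__(self, headers):
        self._headers = headers

    def get_headers(self):
        return self._headers


def get_oauth_params_from_auth_header(env):
    header = env.get_headers().get("Authorization", "")
    scheme, _, rest = header.partition(" ")
    if scheme.lower() != "oauth":
        return {}
    # Values are percent-encoded in the header, so decode them.
    return {key: unquote(value) for key, value in _PARAM.findall(rest)}
```

Unlike the Lua version there is no global to fall back on, so the caller always supplies `env`; in production code that argument would be an adapter around the real request object.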

Slide 56

Slide 56 text

No content

Slide 57

Slide 57 text

No content

Slide 58

Slide 58 text

test harness

Slide 59

Slide 59 text

test harness

Slide 60

Slide 60 text

test harness

Slide 61

Slide 61 text

test harness

Slide 62

Slide 62 text

test harness nginx-oauth1.conf

Slide 63

Slide 63 text

test harness nginx-oauth1.conf

Slide 64

Slide 64 text

test harness nginx-oauth1.conf

Slide 65

Slide 65 text

def spec_using_valid_oauth_credentials(self, harness):
    auth = OAuth1("mykey", "mysecret")
    body_data = "{'function': 'tick'}"
    harness.reset_data()
    response = requests.post(root_url, data=body_data, auth=auth,
                             headers={'Content-Type':
                                      'application/json'})
    assert response.status_code == 200
    assert "Authorization" in harness.forwarded_headers
    assert "Oauth" in harness.forwarded_headers["Authorization"]
    assert harness.forwarded_body == body_data

Slide 66

Slide 66 text

Deployment

Slide 67

Slide 67 text

No content

Slide 68

Slide 68 text

Configs in VCS Playbooks in VCS

Slide 69

Slide 69 text

Configs in VCS config templates API configs vault (keys) Playbooks in VCS

Slide 70

Slide 70 text

Configs in VCS config templates API configs vault (keys) Playbooks in VCS

Slide 71

Slide 71 text

Configs in VCS config templates API configs vault (keys) Playbooks in VCS vip.conf vhost.lua vhost.conf vhost.conf vhost.lua nginx.conf

Slide 72

Slide 72 text

Configs in VCS config templates API configs vault (keys) Playbooks in VCS vip.conf vhost.lua vhost.conf vhost.conf vhost.lua nginx.conf

Slide 73

Slide 73 text

Configs in VCS config templates API configs vault (keys) ssh Playbooks in VCS vip.conf vhost.lua vhost.conf vhost.conf vhost.lua nginx.conf

Slide 74

Slide 74 text

Configs in VCS config templates API configs vault (keys) ssh Playbooks in VCS vip.conf vhost.lua vhost.conf vhost.conf vhost.lua nginx.conf

Slide 75

Slide 75 text

Results

Slide 76

Slide 76 text

Performance

Slide 77

Slide 77 text

switch Performance

Slide 78

Slide 78 text

switch mean 99th Performance

Slide 79

Slide 79 text

switch ~10x mean 99th Performance

Slide 80

Slide 80 text

Stability

Slide 81

Slide 81 text

switch Stability

Slide 82

Slide 82 text

Impact
index=codebig host=*.cimops.net source="/var/log/nginx/access.log" | eval d = request_time - upstream_response_time | timechart span=1m perc99(d) max(d)

Slide 83

Slide 83 text

Impact (chart axis: seconds)
index=codebig host=*.cimops.net source="/var/log/nginx/access.log" | eval d = request_time - upstream_response_time | timechart span=1m perc99(d) max(d)

Slide 84

Slide 84 text

Impact (chart axis: seconds; series: 99th)
index=codebig host=*.cimops.net source="/var/log/nginx/access.log" | eval d = request_time - upstream_response_time | timechart span=1m perc99(d) max(d)

Slide 85

Slide 85 text

Impact (chart axis: seconds; series: 99th, max)
index=codebig host=*.cimops.net source="/var/log/nginx/access.log" | eval d = request_time - upstream_response_time | timechart span=1m perc99(d) max(d)
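The query measures the proxy's own overhead: total request time minus the time spent waiting on the upstream. The sketch below reproduces that calculation on a few made-up samples. The sample values and function names are hypothetical, and the nearest-rank percentile used here is an assumption; Splunk's `perc99` may compute its estimate differently.

```python
import math


def overhead(request_time, upstream_response_time):
    # d = request_time - upstream_response_time, as in the query.
    return request_time - upstream_response_time


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    if not values:
        raise ValueError("percentile of empty list")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up (request_time, upstream_response_time) pairs in seconds.
samples = [(0.110, 0.100), (0.205, 0.200), (0.052, 0.050), (1.500, 1.000)]
deltas = [round(overhead(rt, up), 3) for rt, up in samples]
p99, worst = percentile(deltas, 99), max(deltas)
```

Subtracting the upstream time is what makes the chart a statement about the API management layer itself rather than about the origin services behind it.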

Slide 86

Slide 86 text

Successes

Slide 87

Slide 87 text

Successes great performance improvements

Slide 88

Slide 88 text

Successes great performance improvements 68 endpoints in August

Slide 89

Slide 89 text

Successes great performance improvements 68 endpoints in August ~150 endpoints in September

Slide 90

Slide 90 text

Challenges

Slide 91

Slide 91 text

Challenges 3rd-party Lua ecosystem

Slide 92

Slide 92 text

Challenges 3rd-party Lua ecosystem deployment playbook churn

Slide 93

Slide 93 text

Challenges 3rd-party Lua ecosystem deployment playbook churn configuration file size

Slide 94

Slide 94 text

Challenges 3rd-party Lua ecosystem deployment playbook churn configuration file size kernel tuning

Slide 95

Slide 95 text

Challenges 3rd-party Lua ecosystem deployment playbook churn configuration file size kernel tuning owning availability

Slide 96

Slide 96 text

Conclusion

Slide 97

Slide 97 text

Conclusion NGINX + Lua for HTTP middleware

Slide 98

Slide 98 text

Conclusion NGINX + Lua for HTTP middleware Automated test and deployment pipeline

Slide 99

Slide 99 text

Conclusion NGINX + Lua for HTTP middleware Automated test and deployment pipeline Concurrent request limiting

Slide 100

Slide 100 text

Conclusion NGINX + Lua for HTTP middleware Automated test and deployment pipeline Concurrent request limiting Operational flexibility

Slide 101

Slide 101 text

Thanks