Evolving Auth0's architecture: From 0 to 2.5+ billion logins per month in 5 years

In recent years, we’ve seen the emergence of a new form of technology scale. Today’s emerging technologies—which rapidly grow to millions of users—don’t sell products or services. Instead they build a platform on which others can create value. However, these new platforms often fail because the design and growth strategies involved in building them are complex, resource intensive, and expensive to scale. Despite this massive challenge, many companies in the identity and access management (IAM) and customer identity and access management (CIAM) space are still building their own IDaaS platform internally—and oftentimes failing to achieve their goals.

Damian Schenkelman dives into the complexities, resources, and scalability challenges Auth0 has faced in creating an IDaaS platform that securely manages more than 2.5 billion logins per month. You’ll explore specific scenarios including scaling password hashing, user search, and designing across multiple cloud regions, among others.

Damian Schenkelman

February 25, 2020

Transcript

  1. Evolving Auth0's architecture
    From 0 to 2.5+ billion logins per month in 5 years
    Damian Schenkelman
    Principal Engineer @ Auth0

  2. Auth0
    User
    Auth0
    Customer App

  3. 2014-2019

  4. Auth0 keeps going

  5. Ideal

  6. Pragmatic

  7. Lay of the land
    Trust Pillars: Scale, Reliability, Security
    Features: User Management, Protocols, Session Management, Authorization, Anomaly Detection, User Search, Identity Providers, Auditing, Credential Stuffing
    Experiences: UIs, Support, SDKs, Docs, APIs

  8. Stories

  9. Environments

  10. Single Region
    AZ 1
    AZ 2
    AZ 3
    AWS Region

  11. Multi Region
    Failover AWS Region
    Main AWS Region

  12. Worldwide

  13. Scale

  14. Latency

  15. Failure Domains

  16. Data Sovereignty

  17. Cost

  18. IAM

  19. User Search

  20. email.domain:auth0.com
    AND logins_count:[0 TO 10}
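
The query on this slide uses Lucene-style range syntax, in which a square bracket marks an inclusive bound and a curly brace an exclusive one, so `[0 TO 10}` matches 0 through 9. The sketch below illustrates only those bound semantics; `parseRange` and `inRange` are hypothetical helpers written for this example, not part of any Auth0 API.

```javascript
// Hypothetical helpers illustrating Lucene-style range bounds.
// "[" and "]" are inclusive, "{" and "}" are exclusive.
function parseRange(text) {
  const match = /^([\[{])\s*(-?\d+)\s+TO\s+(-?\d+)\s*([\]}])$/.exec(text);
  if (!match) throw new Error(`Not a numeric range: ${text}`);
  return {
    min: Number(match[2]),
    minInclusive: match[1] === '[',
    max: Number(match[3]),
    maxInclusive: match[4] === ']',
  };
}

function inRange(range, value) {
  const aboveMin = range.minInclusive ? value >= range.min : value > range.min;
  const belowMax = range.maxInclusive ? value <= range.max : value < range.max;
  return aboveMin && belowMax;
}

const range = parseRange('[0 TO 10}');
console.log(inRange(range, 0));  // true: lower bound is inclusive
console.log(inRange(range, 10)); // false: upper bound is exclusive
```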

  21. plan:"pro"

  22. theme:"butterflies"

  23. 2013
    MongoDB as the database
    Expose search

  24. “The code”
    if (opts.search) {
      var searchFilter = { $or: [
        { name: {'$regex': opts.search, '$options': 'i'} },
        { email: {'$regex': opts.search, '$options': 'i'} }
      ]};
      queryDocument = {$and: [queryDocument, searchFilter]};
    }

  25. 2015
    Case insensitive performance issues
    Inability to search on metadata
    Move to ElasticSearch

  26. Architecture
    User Data
    Auth0 Authentication API
    User Store
    Auth0 User Search API
    Indexer
    Kinesis

  27. 2017
    Overly permissive syntax
    High cardinality keys affected ES
    Move to Postgres

  28. Cardinality
    "zipCodes": {
      "98004": 1234,
      "98005": 5678
    }
    "zipCodes": [
      { "value": "98004", "mapping": 1234 },
      { "value": "98005", "mapping": 5678 }
    ]
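
The two shapes on this slide differ in how many distinct field names an index sees: the object form introduces one key per zip code, while the array form always uses the same two keys, `value` and `mapping`. A minimal sketch of that conversion follows; `toValueMappingList` is a hypothetical helper for illustration, not Auth0's indexer code.

```javascript
// Hypothetical helper: turn an object with unbounded keys into a list
// of { value, mapping } entries so the set of field names stays fixed.
function toValueMappingList(keyed) {
  return Object.entries(keyed).map(([value, mapping]) => ({ value, mapping }));
}

const metadata = { zipCodes: { '98004': 1234, '98005': 5678 } };
const indexed = { zipCodes: toValueMappingList(metadata.zipCodes) };

console.log(JSON.stringify(indexed));
// {"zipCodes":[{"value":"98004","mapping":1234},{"value":"98005","mapping":5678}]}
```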

  29. Import

  30. Search

  31. Partitioning
    Users
    Single Tenant Partition N
    Single Tenant Partition 1
    Multi Tenant Partition N
    Multi Tenant Partition 1

  32. Tap Compare
    https://zachholman.com/talk/move-fast-break-nothing/
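
The idea behind tap compare is to run a new code path alongside the existing one, keep serving the existing result, and record any differences. A minimal synchronous sketch under those assumptions follows; `tapCompare`, `searchOld`, and `searchNew` are hypothetical names, and the stand-in implementations exist only to make the example run.

```javascript
// Hypothetical sketch: serve the old result, run the candidate on the
// side, and report mismatches instead of exposing them to callers.
function tapCompare(oldFn, newFn, onMismatch) {
  return (...args) => {
    const expected = oldFn(...args);
    try {
      const actual = newFn(...args);
      if (JSON.stringify(actual) !== JSON.stringify(expected)) {
        onMismatch({ args, expected, actual });
      }
    } catch (error) {
      // A failing candidate must never break the existing path.
      onMismatch({ args, expected, error: String(error) });
    }
    return expected;
  };
}

// Stand-in implementations, for illustration only.
const users = ['ana@auth0.com', 'Bob@example.com'];
const searchOld = (q) => users.filter((u) => u.toLowerCase().includes(q.toLowerCase()));
const searchNew = (q) => users.filter((u) => u.includes(q)); // case-sensitive on purpose

const mismatches = [];
const search = tapCompare(searchOld, searchNew, (m) => mismatches.push(m));

console.log(search('bob'));     // result comes from searchOld
console.log(mismatches.length); // 1
```

Comparing serialized results keeps the sketch short; a real comparison would likely need to ignore ordering or fields that are expected to differ.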

  33. Password Hashing

  34. Initial Scenario
    Client
    Access Token
    Username + Password
    Auth0 Authentication API
    Auth0 Identity Provider
    Users Store

  35. Expected

  36. Actual

  37. Flamegraph

  38. Bcrypt
    Is designed to be slow…
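
Bcrypt's slowness is tunable: its cost parameter is the base-2 logarithm of the number of key-expansion iterations, so each increment doubles the work per hash. The back-of-the-envelope sketch below shows what that implies for capacity; the `msPerHash` and `cores` values are made-up inputs, not Auth0 measurements.

```javascript
// Iterations grow as 2^cost, which is how bcrypt defines its cost factor.
function bcryptIterations(cost) {
  return 2 ** cost;
}

// Rough ceiling on hashes per second when every login needs one hash
// and hashing is CPU bound. Inputs are illustrative assumptions.
function maxHashesPerSecond(msPerHash, cores) {
  return Math.floor((1000 / msPerHash) * cores);
}

console.log(bcryptIterations(10));       // 1024
console.log(bcryptIterations(12));       // 4096
console.log(maxHashesPerSecond(100, 4)); // 40
```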

  39. Tradeoffs

  40. Bcrypt Service

  41. End Scenario
    Client
    Access Token
    Username + Password
    Auth0 Authentication API
    Auth0 Identity Provider
    Users Store
    bcrypt service

  42. Extensibility

  43. Kernel / Userland

  44. Pipeline

  45. Pipeline

  46. Extensibility

  47. Webhooks

  48. Simple

  49. Limit resources

  50. Quarantine

  51. Pre-emptive termination

  52. Architecture
    Warm Pool
    Runtime
    Management API
    Webtask Store
    code + data
    response

  53. Field enablement

  54. Discovery

  55. Continuous Authentication

  56. Targeted attacks

  57. Credential Stuffing
    Leaked Credentials
    Access Attempts

  58. In Numbers

  59. Trusted IPs
    129.31.55.2

  60. New device

  61. Impossible travel

  62. Impossible travel

  63. !
    Impossible travel
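
An impossible-travel check compares the distance between two consecutive login locations with the time elapsed between them. The sketch below uses the haversine great-circle formula; the 1000 km/h threshold, the field names, and `isImpossibleTravel` are assumptions for illustration, not Auth0's scoring rules.

```javascript
const EARTH_RADIUS_KM = 6371;

// Great-circle distance between two { lat, lon } points given in degrees.
function haversineKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Flags two logins whose implied speed exceeds maxKmh (assumed threshold).
function isImpossibleTravel(prev, next, maxKmh = 1000) {
  const hours = (next.time - prev.time) / 3_600_000; // milliseconds to hours
  if (hours <= 0) return haversineKm(prev, next) > 0;
  return haversineKm(prev, next) / hours > maxKmh;
}

const buenosAires = { lat: -34.6, lon: -58.38, time: 0 };
const seattle = { lat: 47.61, lon: -122.33, time: 60 * 60 * 1000 }; // one hour later
console.log(isImpossibleTravel(buenosAires, seattle)); // true
```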

  64. Signals

  65. Extensible
    const confidence = (context.anomalyDetection &&
      context.anomalyDetection.confidence) || 'low';
    if (confidence === 'low') { /* block */ }
    if (confidence === 'medium') { /* ask for mfa */ }
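
The snippet on this slide reads a confidence level from the rule context and branches on it. The sketch below restates the same decision as a pure function so it can be run in isolation; `decideAction` and its return values are hypothetical, and only `context.anomalyDetection.confidence` comes from the slide.

```javascript
// Hypothetical wrapper around the slide's logic. As on the slide, a
// missing signal is treated as 'low' confidence.
function decideAction(context) {
  const confidence =
    (context.anomalyDetection && context.anomalyDetection.confidence) || 'low';
  if (confidence === 'low') return 'block';
  if (confidence === 'medium') return 'mfa';
  return 'allow';
}

console.log(decideAction({ anomalyDetection: { confidence: 'medium' } })); // mfa
console.log(decideAction({}));                                             // block
```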

  66. Scoring Model
    Client
    Credentials
    Auth0 Authentication API
    Kinesis
    Logs
    Enhance Logs
    Kinesis
    Anonymized Auth Attempts
    Scoring Service

  67. Good behavior

  68. Bad Behavior

  69. In Conclusion

  71. Thanks
    Damian Schenkelman
    @dschenkelman

  72. Rate this session
