TomTom NavCloud on AWS

A talk from AWS Amsterdam meetup

Dmitry Ivanov

April 29, 2014

Transcript

  1. NAVCLOUD ON AWS
    AWS Amsterdam Meetup!
    April 29th 2014

  2. NAVCLOUD
    • A cloud-based storage service that allows users
    to seamlessly synchronize trip information between
    devices as well as share and receive navigation
    information with other people (e.g., friends or
    companies).
    • NavCloud aims to be scalable and reactive while
    ensuring privacy and security.

  3. The Team
    Full stack developers
    • Server
    • Mobile / SDKs
    • Systems / AWS

  4. Architecture
    [Diagram: clients connect over HTTP(S) to several API nodes, which are backed by a Riak cluster.]

  5. Architecture
    Horizontal scaling
    • Stateless* API nodes.
    • No direct interconnection between API nodes.
    • Riak scales horizontally very well.

  6. But Why AWS?
    • Embracing DevOps.
    • Embracing (horizontal) scalability.
    • A whole ecosystem of different services helping to
    solve (almost) any task.

  7. Our AWS approach
    • We allocate resources with CloudFormation stacks.
    • We build stacks inside of VPC.
    • We use S3 to store files, backups and logs.
    • We manage our DNS records with Route53.

  8. So What About AWS?

  9. Syncing Data
    Problem
    The client wants to receive updates in the background.
    Solutions (client/server)
    • Polling
    • Streaming

  10. Streaming
    [Diagram: clients hold HTTP 1.1 chunked connections to several HTTP API nodes, which are backed by a Riak cluster.]
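
    As a rough illustration of the HTTP 1.1 chunked framing used for streaming, here is a minimal Python sketch. The function names and the JSON payloads are hypothetical and not taken from NavCloud; only the wire format (hex size, CRLF, data, CRLF, closed by a zero-length chunk) comes from HTTP/1.1 itself.

```python
def encode_chunk(payload: bytes) -> bytes:
    """Frame one payload as an HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    return f"{len(payload):X}".encode("ascii") + b"\r\n" + payload + b"\r\n"


# A zero-length chunk followed by an empty trailer ends the response body.
LAST_CHUNK = b"0\r\n\r\n"


def stream_body(updates):
    """Yield wire-format chunks for an iterable of update payloads."""
    for update in updates:
        # Skip empty payloads: a zero-length chunk would end the body early.
        if update:
            yield encode_chunk(update)
    yield LAST_CHUNK


if __name__ == "__main__":
    # Hypothetical payloads, only to show the framing.
    for frame in stream_body([b'{"route": 1}', b'{"route": 2}']):
        print(frame)
```

    The server keeps the response open and writes one chunk per update, so the client receives changes without issuing a new request each time.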

  11. Streaming on AWS

  12. Streaming on AWS
    • All requests go via ELB.
    • Using a TCP/SSL listener instead of an HTTP(S) one. Proxy protocol
    support.
    • ELB closes idle connections after a timeout (60s).
    • Sending ‘heartbeat’ messages to keep the connection
    alive.
    • HTTP(S) listener: issue with TCP RST messages.
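
    The heartbeat idea can be sketched as a generator that emits a filler message whenever no real event arrives in time. This is an illustrative sketch, not the NavCloud implementation: the 30-second interval, the newline heartbeat payload and the `None` end-of-stream marker are all assumptions; only the 60s timeout comes from the slide.

```python
import queue

ELB_IDLE_TIMEOUT_S = 60    # from the slide
HEARTBEAT_INTERVAL_S = 30  # assumption: comfortably below the idle timeout
HEARTBEAT = b"\n"          # assumption: a payload that clients simply ignore


def frames(events, interval=HEARTBEAT_INTERVAL_S):
    """Yield events from a queue.Queue as they arrive.

    If `interval` seconds pass without an event, yield a heartbeat so the
    connection never looks idle. A None event ends the stream.
    """
    while True:
        try:
            event = events.get(timeout=interval)
        except queue.Empty:
            yield HEARTBEAT
            continue
        if event is None:
            return
        yield event
```

    Each yielded frame would then be written to the open streaming response; the interval only needs to stay below the load balancer's idle timeout.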

  13. Your Own Load Balancer?
    • HAProxy, Nginx, Apache.
    • Full and real-time access to logs.
    • Configurability.
    • But: HA setup? Multi-AZ?
    • But: Security setup.
    Tradeoff: configurability vs simplicity

  14. Streaming on AWS Improved

  15. Streaming on AWS Improved
    • Amazon’s good practice.
    • But: API node should be directly accessible.
    • More improvements: distributed events (across
    API nodes) instead of polling storage for updates.
    • Message Queue (RabbitMQ cluster) with Fanout pattern.
    • AWS alternative: SNS + SQS fanout pattern.
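
    To make the fanout pattern concrete, here is a small in-process stand-in written with the Python standard library. It is not RabbitMQ or SNS/SQS code; the class and method names are hypothetical, and it only illustrates the behaviour: every subscribed API node gets its own copy of every event.

```python
import queue


class FanoutExchange:
    """In-process stand-in for a fanout exchange.

    Every bound queue receives its own copy of every published message.
    """

    def __init__(self):
        self._queues = []

    def bind(self):
        """Create and register a new queue, e.g. one per API node."""
        q = queue.Queue()
        self._queues.append(q)
        return q

    def publish(self, message):
        """Deliver the message to every bound queue."""
        for q in self._queues:
            q.put(message)


if __name__ == "__main__":
    exchange = FanoutExchange()
    node_a, node_b = exchange.bind(), exchange.bind()
    exchange.publish({"user": "u1", "change": "route-updated"})  # hypothetical event
    print(node_a.get_nowait(), node_b.get_nowait())
```

    With this shape, an API node holding a streaming connection learns about a change made through another node without polling storage.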

  16. Streaming on AWS Improved

  17. ELB ‘features’ :)
    • Performance tests. Pre-warming.
    • Really easy to hit its limits beyond ~10K concurrent
    connections.
    • Ask Amazon support to pre-warm, or just run the
    tests for some time without measuring.
    • Logs access. Improved lately (export to S3) …

  18. Monitoring
    • We are investigating StackDriver (stackdriver.com).
    • Third-party monitoring tool with rich and
    customizable UI.
    • Custom application metrics.
    • Supports monitoring of a lot of standard services out
    of the box: Riak, Message Queue services, App
    containers.

  19. Provisioning
    CloudFormation!
    • JSON script that describes the whole stack.
    • Automatic resource lifecycle management.
    • VPC, Security, Route53 records, S3, EC2 ->
    everything is managed inside CF scripts.
    • Currently we are stuck with a monolithic CF
    script -> 3000 LOC. Not very manageable.
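
    For readers who have not seen a CloudFormation script, the following Python sketch builds a tiny template as a dictionary and serializes it to JSON. It is a hypothetical fragment, far smaller than the stack described here: the logical IDs, zone name and record values are made up for illustration, while the resource types and the template version string are standard CloudFormation.

```python
import json


def build_template(bucket_logical_id, zone_name, record_name, target):
    """Return a minimal CloudFormation template as a plain dict."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Hypothetical fragment: one S3 bucket and one DNS record",
        "Resources": {
            # S3 bucket, e.g. for backups or logs.
            bucket_logical_id: {"Type": "AWS::S3::Bucket"},
            # DNS record pointing a stable name at the stack's entry point.
            "ApiDnsRecord": {
                "Type": "AWS::Route53::RecordSet",
                "Properties": {
                    "HostedZoneName": zone_name,
                    "Name": record_name,
                    "Type": "CNAME",
                    "TTL": "60",
                    "ResourceRecords": [target],
                },
            },
        },
    }


if __name__ == "__main__":
    template = build_template(
        "BackupBucket", "example.com.", "api.example.com.", "lb.example.com"
    )
    print(json.dumps(template, indent=2))
```

    Generating the JSON from code like this is only one possible way to keep a large template readable; the deck itself does not say how the script was authored.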

  20. Deployment
    • We use the Python boto library to talk to AWS
    services, including calling our own scripts
    during CloudFormation stack setup.
    • Python scripts + shell scripts (AWS SDK CLI).
    • Capistrano for doing distributed tasks.

  21. Capistrano
    Capistrano - a remote server automation and
    deployment tool written in Ruby.
    • Agent-less: Needs ssh and POSIX-compatible shell.
    That’s it.
    • Routing out of the box (connecting via ssh router).

  22. Capistrano with CF stacks
    Problem: dynamic nature of AWS resources. IP
    addresses can’t be hardcoded.
    Solution: Auto-discovery of CF resources
    (e.g. stacks, hosts) is part of the Capistrano job.

  23. Capistrano with CF stacks

  24. Capistrano with CF stacks
    • lsfleet is a simple shell script that queries the
    CloudFormation API and returns the IP addresses of
    instances within the supplied Auto Scaling group.
    • Could be done even more easily with the Ruby AWS SDK.
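
    The original lsfleet is a shell script and is not shown in the deck. As a hedged illustration of the same lookup, here is a Python sketch that resolves an Auto Scaling group from a stack and returns its instances' private IP addresses. The function name, the three-step lookup and the use of boto3-style clients are assumptions; the clients are passed in, so the function can be exercised with stubs and without AWS access.

```python
def list_fleet_ips(cfn, autoscaling, ec2, stack_name, asg_logical_id):
    """Return private IPs of the instances in a stack's Auto Scaling group.

    `cfn`, `autoscaling` and `ec2` are expected to behave like boto3 clients
    for CloudFormation, Auto Scaling and EC2 (an assumption of this sketch).
    """
    # 1. Resolve the group's physical name from its logical ID in the stack.
    detail = cfn.describe_stack_resource(
        StackName=stack_name, LogicalResourceId=asg_logical_id
    )
    asg_name = detail["StackResourceDetail"]["PhysicalResourceId"]

    # 2. List the instance IDs currently in that group.
    groups = autoscaling.describe_auto_scaling_groups(
        AutoScalingGroupNames=[asg_name]
    )
    instance_ids = [
        inst["InstanceId"]
        for group in groups["AutoScalingGroups"]
        for inst in group["Instances"]
    ]
    if not instance_ids:
        return []

    # 3. Look up each instance's private IP address.
    reservations = ec2.describe_instances(InstanceIds=instance_ids)["Reservations"]
    return [
        inst["PrivateIpAddress"]
        for reservation in reservations
        for inst in reservation["Instances"]
        if "PrivateIpAddress" in inst
    ]
```

    The returned list is what a deployment tool needs as its host list, which avoids hardcoding addresses that change whenever the stack is recreated.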

  25. Capistrano Use Cases
    • Distributing application across the whole App stack
    (Deploying to different ‘dev’ CF stacks).
    • Gathering log files.
    • Getting some OS-related stats from all nodes.
    • Interactively invoking commands on all nodes of an ASG.

  26. Capistrano: why bother?
    Before:
    • Huge (480+304 LOC) shell scripts for app deployment.
    • Doing manual ssh routing, etc.
    After:
    • Capfile ~70 LOC & helper shell scripts (50+153 LOC).
    • Easier to maintain.
    • Capistrano params: easier to configure.

  27. ‘Switching’ The Stacks
    • Allows dev environment updates to be fully automated.
    Can be a Continuous Integration job!
    • Decreasing the downtime.
    • Procedure:
    1. Provision the new CF stack using Python boto script.
    2. Download & apply the latest backup from S3 using a shell
    script & the s3cmd tool.
    3. Switch the Route53 DNS record using AWS API.
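
    The three-step procedure can be sketched in Python as follows. This is an outline under stated assumptions, not the team's script: the function names are hypothetical, the three steps are passed in as callables so nothing here touches AWS, and the dictionary follows the change-batch shape accepted by the Route53 ChangeResourceRecordSets API.

```python
def build_dns_switch(record_name, new_target, ttl=60):
    """Build a Route53 change batch that repoints a CNAME at a new target."""
    return {
        "Comment": "Point the environment at the freshly provisioned stack",
        "Changes": [
            {
                "Action": "UPSERT",  # create the record or overwrite it
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "CNAME",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": new_target}],
                },
            }
        ],
    }


def switch_stack(provision, restore_backup, apply_dns_change, record_name):
    """Run the three steps in order; each step is a caller-supplied callable.

    provision() must return the new stack's endpoint (hypothetical contract).
    """
    endpoint = provision()          # 1. provision the new CF stack
    restore_backup(endpoint)        # 2. apply the latest backup from S3
    apply_dns_change(build_dns_switch(record_name, endpoint))  # 3. switch DNS
    return endpoint
```

    Because the DNS record is switched only after the new stack is provisioned and restored, the old stack keeps serving traffic until the last step, which is where the reduced downtime comes from.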

  28. Questions?
    Dmitry Ivanov @idajantis
    [email protected]
    Vincenzo Vitale [email protected]
    Nami Nasserazad [email protected] @nami4552
    @sicilianamente
