Softonic Migration to the Cloud

Architecture changes and data migrations.

We show how we split a monolith into microservices and how we managed the data changes to move everything to Google Cloud, using its managed Kubernetes solution, GKE.

Presented during the Google Cloud Summit Madrid'19

Softonic

June 12, 2019
Transcript

  2. Who are we? A 20-year-old Internet property and the world's
     largest software and app catalog. Translated into 20 languages
     (EN, DE, ES, IT..., including exotic ones like Kiswahili).
     4M daily visits, 12M daily page views. 10K docs/s written to
     logs at peak time across our services.
  3. Who are we? Basilio Vera (@basi), Senior Principal Software
     Engineer. 16 years working at Softonic. Master of none.
  7. Softonic Production Infrastructure We came from a monolithic
     application deployed in a bare-metal datacenter. We have migrated
     “everything” to SOA and all our workloads run on a cloud
     provider: Google Cloud. Google Kubernetes Engine (GKE) running in
     different regions and zones. We use the Elastic Stack for log
     processing (and as a runtime database!). Prometheus & Grafana for
     monitoring.
  8. Legacy Architecture [Diagram: EN and ES users hitting N web
     servers in the BCN datacenter; per-locale MySQL masters (DB EN,
     DB ES), each with N replicas; CMS, SDC, and SADS; a PHP Download
     API; binaries served from external binary providers]
  9. Cloud Architecture - Main datacenter [Diagram: internal and
     external users reaching the Noodle web frontend and a set of
     APIs — Download API (PHP), User Rating API (PHP), Developer API
     (PHP), Catalog API (PHP), Setup API (PHP), Affiliation API (PHP),
     Experiments API (Hapi), Categories API (Hapi), Autocat API,
     Apisoba (Hapi) — most backed by CloudSQL; CMS, Affiliation CMS,
     and SDH CMS for internal users; MAGNET; users data and comments
     data stores; binary providers]
  10. Cloud Architecture - Other datacenters [Diagram: users hitting
     the Noodle web frontend, Download API (PHP), Apisoba (Hapi), and
     Affiliation API (PHP, CloudSQL); users data and comments data;
     binaries from external binary providers]
  11. Cloud Multidatacenter [Diagram: traffic flow — USA, JP, and FR
     users enter through the Google Global Load Balancer (MCI) and are
     routed to the nearest datacenter: Main (Europe), Asia Southeast,
     or USA West]
  13. Migrating data to a microservice When extracting a microservice
     from a monolith we are extracting the functional code and its
     data. We need to be sure that its data is accessed and modified
     only by it. Business rules: ◦ Don’t affect other teams’
     productivity ◦ Don’t affect end-user products ◦ Make all the
     changes transparent
  15. Migrating data to a microservice Options to migrate data and
     architecture: The hard way ◦ Create a huge ETL ◦ Create a new API
     that uses the new data structure ◦ Modify all the products that
     use the data to use the API instead ◦ Release everything at once
     The soft way ◦ Create a new API able to write/read both the
     legacy and the new datasource ◦ Execute the ETL ◦ Keep both
     datasources in sync until the migration of the products is done
     ◦ Remove the legacy datasource and database
  16. Migrating data to a microservice We like the soft way. You need
     to define the replication strategy for each API.
  17. Migrating data to a microservice Writes go to both datasources;
     reads go to the legacy one. Migrate the initial data. Check that
     the new data is consistent.
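The dual-write phase can be sketched in code. This is a hypothetical illustration, not Softonic's implementation: the store and repository names are made up, and real stores would be MySQL tables or API clients rather than dicts.

```python
class InMemoryStore:
    """Stand-in for a real datasource (legacy MySQL or the new API)."""
    def __init__(self):
        self.rows = {}

    def put(self, key, value):
        self.rows[key] = value

    def get(self, key):
        return self.rows.get(key)


class DualWriteRepository:
    """Writes to both stores; reads from legacy while it is authoritative."""
    def __init__(self, legacy, new, read_from_new=False):
        self.legacy = legacy
        self.new = new
        self.read_from_new = read_from_new  # flipped in a later phase

    def save(self, key, value):
        # Every write goes to both datasources so they stay in sync.
        self.legacy.put(key, value)
        self.new.put(key, value)

    def find(self, key):
        # Reads are served by one datasource; the flag selects which.
        source = self.new if self.read_from_new else self.legacy
        return source.get(key)


legacy, new = InMemoryStore(), InMemoryStore()
repo = DualWriteRepository(legacy, new)
repo.save("app-1", {"name": "Foo"})
assert repo.find("app-1") == {"name": "Foo"}  # served by legacy
assert new.get("app-1") == {"name": "Foo"}    # also written to new
```

Flipping `read_from_new` to `True` is the phase-2 switch: writes keep going to both datasources, but reads now come from the new one.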
  18. Migrating data to a microservice Writes go to both datasources;
     reads go to the new datasource. Migrate the services to use the
     API.
  19. Migrating data to a microservice Migration finished: all the
     products have been migrated to use this API, and the legacy
     datasource is removed.
  20. Migrating data to a microservice Very complex logic in the data
     repository. Many edge cases that need deep checking. Business
     changes are totally frozen during the transition.
  21. Data in the cloud We use Kubernetes in GKE; our microservices
     run in Kubernetes clusters. What about stateful applications?
     Most of our microservices are stateful. It depends…
  22. Data in the cloud We use MySQL as the database for some
     microservices. We use Elasticsearch as the database for storing
     logs and for end-user services. We use Redis for cache and as a
     database. We use Memcached for cache.
  23. MySQL Be sure your data is available. Be sure your data is
     backed up. Be sure you can scale. Have a plan B.
  24. MySQL Integrate it with your cloud provider. Define your
     dependencies programmatically. GitOps -> Terraform

```terraform
resource "google_sql_database_instance" "sft-developer-api" {
  name             = "sft-developer-api"
  project          = "${var.project}"
  database_version = "MYSQL_5_7"
  region           = "europe-west1"

  settings {
    tier = "db-n1-standard-1"

    ip_configuration {
      ipv4_enabled    = "true"
      private_network = "${var.softonic_network}"
    }

    backup_configuration {
      binary_log_enabled = true
      enabled            = true
    }

    user_labels {
      app  = "developer-api"
      site = "${terraform.workspace}"
    }
  }
}
```
  26. MySQL We have some microservices running in different regions.
     Try to avoid data replication across regions: the replication
     protocols are designed for LAN, and different regions mean WAN.
     Application design + business requirements = a compromise
     solution.
  27. MySQL Recommendation: use a private IP to connect to the
     database. It lets you avoid the Cloud SQL Proxy*, which needs a
     dedicated Kubernetes deployment or has to run as a sidecar.
     *Cloud SQL Proxy: https://cloud.google.com/sql/docs/mysql/sql-proxy
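If you do run the proxy, the sidecar option looks roughly like the fragment below. This is a hedged sketch, not an actual Softonic manifest: the deployment name, app image, proxy image tag, and instance connection name are all placeholders.

```yaml
# Sketch of a Cloud SQL Proxy sidecar (placeholder names throughout).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: developer-api
spec:
  replicas: 1
  selector:
    matchLabels: {app: developer-api}
  template:
    metadata:
      labels: {app: developer-api}
    spec:
      containers:
      - name: app
        image: example/developer-api:latest   # placeholder image
        env:
        - name: DB_HOST
          value: "127.0.0.1"                  # app talks to the sidecar
      - name: cloudsql-proxy
        image: gcr.io/cloudsql-docker/gce-proxy:1.16
        command:
        - /cloud_sql_proxy
        - "-instances=my-project:europe-west1:sft-developer-api=tcp:3306"
```

With a private IP, this whole second container disappears and the app connects to the instance's private address directly.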
  28. MySQL What about running MySQL in the cluster? Initially we
     decided to use CloudSQL, for good reasons: PROS ◦ Simplicity
     ◦ Automatic backups ◦ “Automatic” vertical autoscaling ◦ Peace of
     mind while upgrading the cluster CONS ◦ No REAL HA ◦ More tied to
     your cloud provider ◦ More expensive than running it as a
     container in Kubernetes
  30. Elasticsearch Initially out of the cluster, using elastic.co
     magic running in GCE. Moved into the cluster because... it’s a
     cluster! Elasticsearch allows you to shard and replicate the
     data; we use a replication factor of 3, which lets two Kubernetes
     nodes be down without Elasticsearch downtime. Deployed via a Helm
     chart. Soon you will be able to use the SaaS offering from
     Google.
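A replication factor of 3 means each shard has one primary plus two replica copies, which in Elasticsearch index settings is `number_of_replicas: 2`. A sketch of the settings call (the index name is illustrative):

```
PUT /my-index/_settings
{
  "index": {
    "number_of_replicas": 2
  }
}
```

With two replicas, any two nodes holding copies of a shard can be down and the third copy still serves reads and writes.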
  31. Elasticsearch: Upgrading the K8s cluster There is a preStop hook
     in the data Pods that starts moving all of their shards to other
     nodes. The next node cannot start the upgrade until the previous
     node has moved all its data. It makes the upgrade slow but safe.
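The drain step described above can be sketched as a preStop hook using Elasticsearch's shard allocation filtering. This is a hedged sketch, not the actual chart: the service URL, `NODE_NAME` variable, and polling interval are assumptions.

```yaml
# Sketch of a preStop hook on an Elasticsearch data Pod (placeholder
# service URL and env var; not the production Helm chart).
lifecycle:
  preStop:
    exec:
      command:
      - /bin/sh
      - -c
      - |
        # Ask the cluster to move all shards off this node.
        curl -s -X PUT "http://elasticsearch:9200/_cluster/settings" \
          -H 'Content-Type: application/json' \
          -d "{\"transient\":{\"cluster.routing.allocation.exclude._name\":\"${NODE_NAME}\"}}"
        # Block until no shards are still relocating, so the kubelet
        # does not kill the Pod while data is in flight.
        while curl -s "http://elasticsearch:9200/_cluster/health" \
            | grep -q '"relocating_shards":[1-9]'; do
          sleep 5
        done
```

Because the hook blocks until relocation finishes, a rolling node upgrade naturally serializes: each node drains fully before the next one starts.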
  32. Elasticsearch: Data We replicate data across different clusters
     asynchronously. We use a CQRS-like approach, generating a
     “materialized view”.
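The CQRS-like idea can be illustrated with a toy projection: writes append events, and a projector asynchronously folds them into a read-optimized document (the "materialized view"). All names here are illustrative, not Softonic's actual services.

```python
event_log = []          # write side: append-only events
materialized_view = {}  # read side: denormalized documents

def record_comment(app_id, user, text):
    """Command handler: only appends an event, never touches the view."""
    event_log.append({"type": "comment_added", "app_id": app_id,
                      "user": user, "text": text})

def project(events, view):
    """Projector: replays events into the read model."""
    for ev in events:
        if ev["type"] == "comment_added":
            doc = view.setdefault(ev["app_id"],
                                  {"comment_count": 0, "last_comment": None})
            doc["comment_count"] += 1
            doc["last_comment"] = ev["text"]
    return view

record_comment("app-1", "alice", "Great tool!")
record_comment("app-1", "bob", "Works for me")
project(event_log, materialized_view)
assert materialized_view["app-1"]["comment_count"] == 2
assert materialized_view["app-1"]["last_comment"] == "Works for me"
```

In production the event log would be a queue or replication stream and the view would live in Elasticsearch; the key property is the same: the read side is derived, so it can be rebuilt in another cluster asynchronously.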