Deploying and operating GeoServer: a DevOps perspective (FOSS4G 2022 Edition)

Cloud computing is revolutionizing the way companies develop, deploy and operate software, and geospatial software is no exception. The benefits of cloud-based deployments range from cost savings to simplified management, flexibility, lower downtime and scalability of dynamic environments, so it is easy to understand why more and more companies are migrating their on-premise systems to the cloud. Cloud-based setups, however, have their own set of hurdles and challenges.
The migration itself can be challenging. Monitoring, debugging and scaling applications are very different from what you are used to.
In this presentation we will share the lessons we have learned at GeoSolutions in tackling these problems, along with some common patterns for migrating on-premise GeoServer clusters to the cloud. We'll share tips on how to:

migrate your existing GeoServer cluster to the cloud following best practices
gain insights on your GeoServer cluster using centralized logging and the Monitor plugin
avoid common bottlenecks to best set up a distributed, scalable GeoServer cluster
work with containers and container orchestrators like Kubernetes

Simone Giannecchini
August 31, 2022
Transcript

  1. Alessandro Parma
    Luca Pasquali
    GeoSolutions
    Deploying and operating GeoServer:
    a DevOps perspective

  2. Introduction

  3. GeoSolutions
    • Offices in Italy & US, Global Clients/Team
    • 40+ collaborators, 30+ Engineers
    • Our products
    GeoNode
    • Our Offer
    Enterprise Support Services
    Deployment Subscription
    Professional Training
    Customized Solutions

  4. Affiliations
    We strongly support Open
    Source, it is in our core
    We actively participate in
    OGC working groups and
    get funded to advance new
    open standards
    We support standards
    critical to GEOINT

  5. What is GeoServer?
    (really??)

  6. What is GeoServer?
    • GeoSpatial enterprise gateway
    • Management and dissemination of raster and vector data
    • Standards compliant
    • OGC WCS 1.0, 1.1.1 (RI), 2.0
    • OGC WFS 1.0, 1.1 (RI), 2.0
    • OGC WMS 1.1.1, 1.3.0
    • OGC WPS 1.0.0
    • OGC CSW 2.0.1 (ebRIM)
    • Google Earth/Maps support
    • KML, GeoSearch, etc.
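
    As a rough illustration of what talking to one of these standard services looks like, the sketch below builds a WMS 1.3.0 GetMap request URL from the standard request parameters. The host `example.org`, the layer name `demo:roads` and the bounding box are placeholders made up for the example, not values from the talk.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def wms_getmap_url(base_url, layer, bbox, width, height,
                   crs="EPSG:3857", fmt="image/png"):
    """Build a WMS 1.3.0 GetMap URL from the standard request parameters."""
    params = {
        "service": "WMS",
        "version": "1.3.0",
        "request": "GetMap",
        "layers": layer,
        "styles": "",  # empty value asks for the layer's default style
        "crs": crs,
        "bbox": ",".join(str(v) for v in bbox),
        "width": width,
        "height": height,
        "format": fmt,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, layer and extent, for illustration only.
url = wms_getmap_url(
    "https://example.org/geoserver/wms",
    "demo:roads",
    (0, 0, 1000000, 1000000),
    512,
    512,
)
query = parse_qs(urlsplit(url).query)
print(url)
```

    Any WMS-compliant client can issue the same request, which is the practical benefit of the standards compliance listed above.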

  7. What is GeoServer?
    [Diagram: GeoServer between its data sources and the services and formats it publishes]
    Data sources
    • DBMS: PostGIS, Oracle, H2, DB2, SQL Server, GeoPackage, MySql, Spatialite, Elastic, MongoDB
    • Vector files: Shapefile
    • Raster files: GeoTIFF, ArcGrid, Img+world, Mosaic, MrSID, JPEG 2000, ECW, Pyramid, Oracle GeoRaster, PostGis
    • Servers: WFS, WMS
    Published services and formats
    • WFS (raw vector data): Shapefile, GML2, GML3, GeoRSS, GeoJSON, CSV/XLS, GeoPackage
    • WMS (styled maps): PNG, GIF, JPEG, TIFF, GeoTIFF, SVG, PDF, KML/KMZ
    • WCS (raw raster data): GeoTIFF, ArcGrid, GTopo30, Img+World
    • WMTS, TMS: KML superoverlays, Google maps tiles, OGC tiles, OSGEO tiles
    • Other labels: KML, WPS, CSW, ESRI REST

  8. Applications deployment

  9. App. Execution Technologies
    • How things changed over time
    • Bare Metal (non virtualized)
    Buy or Rent the HW and install everything from scratch. Single
    Tenant
    • Virtual Machines
    Spin up a virtual server along with others running on an existing,
    virtualized server. Looks like a dedicated machine
    • Containers
    Run a single (?) application alongside others on an existing
    environment
    [Arrow label: TIME]

  10. Bare Metal (no virt.)
    • Pros
    • Fast (no virt. layer)
    • Single tenant -> Secure
    • No “noisy neighbour” effect -> more stability
    • Cons
    • Limited flexibility & scalability
    • billing, re-installs
    • Dimensioning is hard -> over-provision
    • More work for Operations

  11. Virtual Machines
    • Pros
    • Flexibility (Start, Stop, Snapshot, Template)
    • Fragmentation and resource usage
    • Reduces costs
    • Cons
    • Virtualization layer overhead
    • Noisy neighbour effect
    • Single failure impacts many services

  12. What is a Container

    What is a container then?

    Type of virtualization that happens at the
    operating system level

    Applications run in isolated user
    spaces called “Containers”

    Implemented at the kernel level, multiple
    containers share the same Kernel

  13. Containers vs VMs

    How does it compare to VMs?

  14. Containers
    • Pros
    • Application is bundled with dependencies
    • Shared Kernel (lower resource)
    • Easy installation and migration
    • High density with isolation
    • Startup time
    • Cons
    • Steep learning curve
    • Existing apps must be “containerized” first
    • Shared Kernel (security?)

  15. Kubernetes


  16. What is Kubernetes

    Platform to manage containers and services

    Originally developed by Google based on
    their experience with Borg

    Manages a set of nodes to provide you with a
    platform to run the containers

    Helps you manage and scale your applications

  17. Why is it relevant?

    Traditional deployments

    No resource boundaries → some applications
    starve for resources

    Can’t easily reallocate resources after the initial
    setup

    Virtual Machines

    Multiple VMs on the same server → better
    resource utilization

    Better isolation

    Each VM has a copy of the OS

  18. Why is it relevant?

    Containers

    shared kernel with isolated userspace

    each container has its own filesystem, a share of
    CPU cores

    decoupled from the underlying infrastructure →
    portable across distributions and cloud providers


  19. Why is it relevant?



    fast image creation and easy rollback
    compared to VMs → Good fit for frequent
    deployments and CI/CD

    separation of concerns between Devs and
    Ops

    consistency across development in multiple
    environments

  20. Why Kubernetes?

  21. Why Kubernetes?

    Manage the containers that run your
    applications in production with no downtime

    Takes care of running your application
    containers on a distributed system

    Scales applications and the cluster nodes, and
    handles failover
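
    To make the application-scaling part concrete, here is a minimal sketch that generates a HorizontalPodAutoscaler manifest (`autoscaling/v2`) as JSON, which `kubectl apply -f` accepts alongside YAML. The Deployment name `geoserver`, the replica bounds and the 70% CPU target are assumptions chosen for illustration. Scaling the nodes themselves is handled by a separate cluster-level autoscaler and is not shown.

```python
import json


def hpa_manifest(deployment, min_replicas, max_replicas, cpu_percent):
    """Return an autoscaling/v2 HorizontalPodAutoscaler targeting a Deployment."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": f"{deployment}-hpa"},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": cpu_percent,
                        },
                    },
                }
            ],
        },
    }


# Hypothetical values: 2 to 6 replicas, scale out above 70% average CPU.
hpa = hpa_manifest("geoserver", 2, 6, 70)
print(json.dumps(hpa, indent=2))
```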


  22. Why Kubernetes?

    Also

    Service discovery and load balancing

    Storage abstraction (mount storage of
    choice)

    Auto rollouts and rollbacks. You describe
    the desired state for your containers

    Self-healing: restarts failing containers,
    probes, ...

    Secrets management. Externalize sensitive
    and env. specific info
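
    A sketch of what "describing the desired state" can look like: the snippet below builds a Deployment manifest with a replica count plus readiness and liveness probes, and prints it as JSON. The image reference, port 8080, the probe path `/geoserver/web/` and the timing values are assumptions for illustration and should be checked against the image actually used.

```python
import json


def geoserver_deployment(image, replicas=2, port=8080,
                         probe_path="/geoserver/web/"):
    """Return an apps/v1 Deployment describing the desired state."""
    # Assumed HTTP health check; adjust path and port to the real image.
    probe = {"httpGet": {"path": probe_path, "port": port}, "periodSeconds": 10}
    container = {
        "name": "geoserver",
        "image": image,
        "ports": [{"containerPort": port}],
        # Readiness gates traffic; liveness restarts a stuck container.
        "readinessProbe": probe,
        "livenessProbe": {**probe, "initialDelaySeconds": 60},
    }
    labels = {"app": "geoserver"}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "geoserver", "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


# Placeholder image reference, not a real published tag.
deployment = geoserver_deployment("registry.example.org/geoserver:placeholder")
print(json.dumps(deployment, indent=2))
```

    Kubernetes then works to keep the running state matching this description, restarting containers whose liveness probe fails.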

  23. How does it compare to..

    There are other orchestrators and tools
    available to manage containers

    Docker Compose

    allows you to define services as collections but
    that is pretty much it

    Docker Swarm

    lets you work on a distributed environment

    services definition and commands are somewhat
    similar to compose

    not as sophisticated (and complex!) as K8s

  24. Containerize GeoServer

    Need a docker image

    Use the ones available!

    GeoSolutions here, sources here

    Official image available, check the blog post here

    Ready to use

    Based on Tomcat images

    Configurable

  25. Example K8s deployment

  26. Helm
    • Package Manager for Kubernetes
    • Use “recipes” available online
    • Deployment unit is called a Chart
    • Charts can easily be published, versioned and
    shared
    • Provide templating for your Manifests
    • GeoSolutions Chart implementation available for
    preview:
    https://github.com/geosolutions-it/charts
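
    The templating idea can be illustrated without Helm itself. The sketch below uses Python's `string.Template` to merge default values with per-environment overrides into a manifest fragment. This is only an analogy: Helm uses Go templates and a `values.yaml` file, and the field names here are made up for the example.

```python
from string import Template

# A manifest fragment with placeholders, standing in for a chart template.
MANIFEST_TEMPLATE = Template(
    "kind: Deployment\n"
    "metadata:\n"
    "  name: ${name}\n"
    "spec:\n"
    "  replicas: ${replicas}\n"
)

# Default values, standing in for a chart's default values file.
DEFAULTS = {"name": "geoserver", "replicas": 1}


def render(overrides=None):
    """Merge overrides on top of the defaults and fill in the template."""
    values = {**DEFAULTS, **(overrides or {})}
    return MANIFEST_TEMPLATE.substitute(values)


# Same template, different environment: only the values change.
rendered = render({"replicas": 3})
print(rendered)
```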

  27. Resources

    What is Kubernetes

    Borg: The Predecessor to Kubernetes

    Containerization

    What is a Container

    Docker Compose

    Docker Swarm

    Rancher

  28. Best Practices

  29. Best Practices …
    • Start simple
    • Set up a local environment (Minikube)
    • Use available Images and Charts, don’t reinvent the wheel
    • Use managed K8s
    • Plan ahead
    • Design your infrastructure before moving on with
    deployment
    • Clustering Strategies, Storage Classes, Volumes
    sizing
    • Implement a Logging and a monitoring solution

  30. … Best Practices
    • Choose the right nodes
    • No need for tons of RAM
    • Compute-optimized nodes perform better (~4
    CPUs/instance)
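
    A back-of-the-envelope sizing sketch based on the ~4 CPUs per instance figure above. The one CPU reserved per node for system components is an assumption, and the node sizes are arbitrary examples.

```python
import math


def instances_per_node(node_cpus, cpus_per_instance=4, reserved_cpus=1):
    """How many GeoServer instances fit on one node after reserving CPU."""
    usable = max(node_cpus - reserved_cpus, 0)
    return usable // cpus_per_instance


def nodes_needed(total_instances, node_cpus, cpus_per_instance=4, reserved_cpus=1):
    """How many nodes of a given size are needed for a target instance count."""
    per_node = instances_per_node(node_cpus, cpus_per_instance, reserved_cpus)
    if per_node == 0:
        raise ValueError("node too small for a single instance")
    return math.ceil(total_instances / per_node)


fit_16 = instances_per_node(16)          # 15 usable CPUs on a 16-CPU node
nodes_for_6 = nodes_needed(6, node_cpus=16)
print(fit_16, nodes_for_6)
```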

  31. Storage Classes

  32. Choose Storage - Block Storage
    • Local or Block Storage
    • Local
    • Non Shared
    • Can fail
    • Low latency
    • Good fit for temporary
    storing Logs and Audits
    • Cached Tiles?

  33. Choose Storage - File Share
    • File Share
    • File Storage ~ NFS
    • Shared (locally)
    • Scales well
    • Typical use case is GeoServer
    datadirs
    • Spatial Data
    • Cached tiles?
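
    For the shared data directory use case, a volume that several pods can mount at once is typically requested with the `ReadWriteMany` access mode. The sketch below generates such a PersistentVolumeClaim as JSON. The claim name, the storage class name `shared-files` and the size are placeholders, since the available storage classes depend on the cluster and provider.

```python
import json


def datadir_pvc(name, storage_class, size):
    """Return a PersistentVolumeClaim for a volume shared by several pods."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # ReadWriteMany lets many pods mount the same volume read-write.
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical names and size, for illustration only.
pvc = datadir_pvc("geoserver-datadir", "shared-files", "50Gi")
print(json.dumps(pvc, indent=2))
```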

  34. Choose Storage - Blob Storage
    • Blob Storage
    • Max scalability
    • Elastic
    • Cheap
    • High Latency
    • Shared
    • Robust
    • Cached Tiles?

  35. Use Cases

  36. EO Data Dissemination

    Earth Observation

    Meteorological and Oceanographic data

    Continuous data ingestion

    TBs of data growing over time. Raster and Vector

  37. EO Data Dissemination

  38. Observability and Debugging

  39. Monitoring and Logging
    • Can be tricky in Containerized
    environments!
    • Dynamic and distributed
    • Instances are spinning
    up and down
    • distributed across multiple
    nodes
    • Hard to identify problems
    • What can we do?

  40. Centralize and Aggregate
    • Centralize and Aggregate
    • Single central location
    • Easy to navigate and filter
    • New nodes can spawn but also go away
    • You’ll need “shippers” to collect and
    send out logs to the central service
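
    What a shipper does can be sketched in a few lines: tag each log line with where it came from, then send the records to the central service in batches. In practice an off-the-shelf log shipper would be used. The function names, the record fields and the `send` callable standing in for the central service client are all hypothetical.

```python
import json


def to_records(lines, pod, node):
    """Tag raw log lines with their origin so they stay traceable
    after the pod or node that produced them is gone."""
    return [
        {"pod": pod, "node": node, "message": line.rstrip("\n")}
        for line in lines
        if line.strip()  # skip blank lines
    ]


def ship(records, send, batch_size=100):
    """Send records in JSON batches through `send` and return the count."""
    sent = 0
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        send(json.dumps(batch))
        sent += len(batch)
    return sent


# Stand-in for the central service: collect payloads in a list.
received = []
records = to_records(
    ["request started\n", "\n", "request done\n", "cache miss\n"],
    pod="geoserver-0",
    node="node-a",
)
count = ship(records, received.append, batch_size=2)
print(count, len(received))
```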

  41. Collect Metrics
    • Collect Metrics
    • Resp Time,
    • Throughput,
    • Uptime,
    • Error Rate
    • ...
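
    As a small worked example, the metrics above can be derived from a list of request records. The record shape (`status`, `duration_ms`) and the sample values are assumptions made up for the sketch, and counting only 5xx responses as errors is one possible convention. Uptime is left out because it needs health-check data rather than request records.

```python
from statistics import mean


def summarize(requests, window_seconds):
    """Compute average response time, throughput and error rate
    for the requests observed in a time window."""
    total = len(requests)
    errors = sum(1 for r in requests if r["status"] >= 500)
    return {
        "avg_response_ms": mean(r["duration_ms"] for r in requests) if total else 0.0,
        "throughput_rps": total / window_seconds,
        "error_rate": errors / total if total else 0.0,
    }


# Made-up sample: four requests observed over a two-second window.
sample = [
    {"status": 200, "duration_ms": 120},
    {"status": 200, "duration_ms": 80},
    {"status": 500, "duration_ms": 400},
    {"status": 200, "duration_ms": 200},
]
summary = summarize(sample, window_seconds=2)
print(summary)
```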

  42. Auditing
    • GeoServer can produce audit files for
    you
    • Monitoring extension
    • Tracks requests made to GeoServer
    • Collect and ingest them to create pretty
    Dashboards

  43. Auditing
    • Audit Event

  44. Analyze performance of your Cluster

  45. Analyze performance of your Cluster

  46. Pointers
    • GeoSolutions Website
    https://geosolutionsgroup.com/
    • GeoServer on Kubernetes
    https://www.geosolutionsgroup.com/blog/devops-k8s/
    • A DevOps perspective on GeoServer
    https://www.geosolutionsgroup.com/blog/devops-geoserver-monitoring-metering/
    • Cloud Optimized GeoTiffs
    https://www.cogeo.org/
    https://docs.geoserver.org/latest/en/user/community/cog/index.html
    • Helm
    https://helm.sh/
