
Deployment of Streaming Application with Ververica Platform on Kubernetes

Kubernetes Frankfurt

Alexey Novakov

July 13, 2023

Transcript

  1. Alexey Novakov Product Solution Architect @ Ververica, Germany - Last

    6 years working in data management - 17 years in Software Development - Distributed Systems, Big/Small/Fast Data - Astronomy, Music
  2. Contents 01 Getting started with Apache Flink and VVP 02

    Flink Application Lifecycle 03 VVP Integrations 04 Summary
  3. About Ververica

    - Original Creators of Apache Flink® - Enterprise Stream Processing with Ververica Platform - Subsidiary of Alibaba Group
  4. Apache Flink - Features: - High-Availability, Incremental checkpointing - Sophisticated late data handling

    - Low latency, High throughput - Scala, Java, SQL and Python APIs - … and much more
  5. What is Ververica Platform (VVP)? Purpose-built for stateful stream processing

    architectures, VVP makes operating these powerful systems easier than ever before by offering an entirely new experience for developing, deploying, and managing stream processing applications. Ververica’s mission is to ensure that developers invest their time in their core business objectives, not in maintenance and infrastructure.
  6. VVP Pre-requirements Bring-your-own Kubernetes - From Cloud Providers: - AWS

    EKS - Azure AKS - Google GKE - On-prem cluster, OpenShift - Local development: minikube, k3s
  7. VVP Helm Package $ helm repo add ververica https://charts.ververica.com $

    helm install vvp ververica/ververica-platform \ -n vvp \ --values values-vvp.yaml *see more: https://docs.ververica.com/getting_started/installation.html#setting-up-the-playground
  8. VVP Control Plane $ kubectl get pod -n vvp -l

    app=vvp-ververica-platform NAME READY STATUS RESTARTS AGE vvp-ververica-platform-75c54fcd6d-95wgh 3/3 Running 0 1m Now it is ready to run Flink applications
  9. User Workflow in VVP (New App Deployment): Upload Flink JAR /

    Python Script / Create SQL Script → Create Deployment → Start Deployment → Monitor → Reconfigure, if needed
  10. Step 0: Build JAR file

    @main def FraudDetectionJob =
      val env = StreamExecutionEnvironment.getExecutionEnvironment
      val transactions = env
        .addSource(TransactionsSource.iterator)
        .name("transactions")
      val alerts = transactions
        .keyBy(_.accountId)
        .process(FraudDetector()) // some fraud detection logic
        .name("fraud-detector")
      alerts
        .addSink(AlertSink()) // print to console
        .name("send-alerts")
      env.execute("Fraud Detection")

    case class Transaction(
      accountId: Long,
      timestamp: Long,
      amount: Double
    )

    Full source code: https://github.com/novakov-alexey/flink-sandbox
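The FraudDetector body lives in the linked repository; purely as an illustration of the kind of rule such an operator might encode, here is a hypothetical, stateless Scala sketch of the classic check "a small transaction immediately followed by a large one on the same account". The Transaction type mirrors the slide; the thresholds and the detectFraud helper are invented for this example:

```scala
case class Transaction(accountId: Long, timestamp: Long, amount: Double)
case class Alert(accountId: Long)

// Illustrative thresholds, not from the deck
val SmallAmount = 1.00
val LargeAmount = 500.00

// Batch analogue of keyBy(_.accountId) + FraudDetector: flag a large
// transaction that directly follows a small one on the same account.
def detectFraud(txns: Seq[Transaction]): Seq[Alert] =
  txns
    .groupBy(_.accountId) // keyBy(_.accountId) analogue
    .values
    .flatMap { perAccount =>
      perAccount
        .sortBy(_.timestamp)
        .sliding(2)
        .collect {
          case Seq(prev, curr)
              if prev.amount < SmallAmount && curr.amount > LargeAmount =>
            Alert(curr.accountId)
        }
    }
    .toSeq
```

In the actual streaming job this pattern would be tracked in keyed state inside a KeyedProcessFunction instead of being computed over a finished batch.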
  11. Step 2: Create New Deployment - Option 2 YAML Can

    be submitted via REST API also. (K8s CRD is coming soon)
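For orientation, a minimal Deployment spec of the kind submitted here might look like the sketch below. This is an assumption-laden illustration: the field names follow the spec.template.spec structure shown on the metrics slide of this deck, but the name and jarUri are placeholders, and the exact schema should be checked against the VVP documentation for your platform version.

```yaml
# Hypothetical minimal VVP Deployment spec (illustrative only)
kind: Deployment
apiVersion: v1
metadata:
  name: fraud-detection
spec:
  state: RUNNING
  template:
    spec:
      artifact:
        kind: JAR
        jarUri: s3://vvp/artifacts/fraud-detection.jar # placeholder URI
```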
  12. Step 2: Create New Deployment - Option 3 REST API

    POST vvp-resources/deployment_target.yaml to the VVP REST API to create the Deployment Target: $ curl localhost:8080/api/v1/namespaces/default/deployment-targets \ -X POST \ -H "Content-Type: application/yaml" \ -H "Accept: application/yaml" \ --data-binary @vvp-resources/deployment_target.yaml Afterwards, you can POST vvp-resources/deployment.yaml to the REST API to create the Deployment: $ curl localhost:8080/api/v1/namespaces/default/deployments \ -X POST \ -H "Content-Type: application/yaml" \ -H "Accept: application/yaml" \ --data-binary @vvp-resources/deployment.yaml
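If you would rather script these calls than use curl, the same requests can be built with Java's standard HTTP client from Scala. The vvpPostRequest helper below is hypothetical; the host, port, headers, and endpoint paths are taken from the curl commands on this slide:

```scala
import java.net.URI
import java.net.http.HttpRequest
import java.nio.file.Path

// Build a POST of a local YAML file to a VVP REST endpoint
// ("deployment-targets" or "deployments"), mirroring the curl calls above.
def vvpPostRequest(endpoint: String, yamlFile: Path): HttpRequest =
  HttpRequest.newBuilder()
    .uri(URI.create(s"http://localhost:8080/api/v1/namespaces/default/$endpoint"))
    .header("Content-Type", "application/yaml")
    .header("Accept", "application/yaml")
    .POST(HttpRequest.BodyPublishers.ofFile(yamlFile))
    .build()
```

Sending the request is then one more line: HttpClient.newHttpClient().send(req, HttpResponse.BodyHandlers.ofString()).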
  13. Step 4: Manage Deployment

    - Monitor running Job - Save state for another Deployment - Create new Deployment with the same configuration
  14. VVP Universal Blob Storage • MinIO is one of the

    options for Universal Blob Storage • MinIO is useful during the development phase • Blob Storage is used for: - Code Artifacts - Flink checkpoints & savepoints https://docs.ververica.com/getting_started/installation.html#setting-up-the-playground
  15. Supported Storage Services: AWS S3 (s3://), Microsoft ABS (wasbs://),

    Apache Hadoop® HDFS (hdfs://), Google GCS (gs://), Alibaba OSS (oss://), Microsoft ABS Workload Identity (wiaz://)

    values.yaml:
      ### Configure MinIO for Universal Blob Storage
      vvp:
        blobStorage:
          baseUri: s3://vvp
          s3:
            endpoint: http://minio.vvp.svc:9000
  16. Access Control - Authentication: • OpenID Connect • SAML - Authorization: •

    Roles: viewer, editor, owner, admin

    Resource            Viewer             Editor     Owner        Admin
    Artifacts           List, GetMetadata  All        All          None
    ApiToken            None               None       All          None
    DeploymentDefaults  Get                Get        All          None
    DeploymentTarget    List, Get          List, Get  All          None
    Apache Flink® UI    Get*               All        All          None
    Namespace           None               None       Get, Update  All
    SecretValue         None               All        All          None
    All Others          List, Get          All        All          None
  17. Metrics Ververica Platform bundles metrics reporters for: • Prometheus •

    InfluxDB • Datadog • Prometheus Pushgateway • Graphite • StatsD • SLF4J

    spec:
      template:
        spec:
          flinkConfiguration:
            metrics.reporters: prometheus
            metrics.reporter.prometheus.class: org.apache.flink.metrics.prometheus.PrometheusReporter
            metrics.reporter.prometheus.port: 9249
  18. VVP Benefits • Stability: Well-tested, pre-configured Flink runtime • Flexibility: Deployable to

    your own K8s cluster • Integration: the most popular third-party systems are integrated (S3, AD, OAuth2, etc.) • Easier Operations: practical UI and REST API; K8s Operator and CRD coming soon
  19. Ververica Cloud (beta) Ultra-high performance cloud-native

    service for real-time data processing based on Apache Flink. Sign up for free. If you do not want to run Flink on K8s yourself, then try Ververica Cloud.