Vector Database Ingestion with PyFlink on Decodable @ Flink Forward Berlin 2024

Abstract:

Join us for a live demo featuring a real-time vector database ingestion use case. We demonstrate how to run Apache Flink jobs written in Python using Decodable's latest managed PyFlink offering. The PyFlink job reads records from an operational database (MySQL), transforms them into vector embeddings on the fly, and streams them into MongoDB, which serves as the vector database, so that retrieval-augmented generation (RAG) architectures can work with fresh data.

Repository: https://github.com/decodableco/examples/tree/main/pyflink-vector-embeddings
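
The linked repository holds the actual PyFlink job. As a rough illustration of the per-record step the abstract describes (source row in, document with embedding out), here is a minimal plain-Python sketch. Everything in it is an assumption made for illustration: the `id`/`text` record schema, the function names, the 8-dimensional vector size, and above all `toy_embedding`, which is a hash-based stand-in for a real embedding model. It does not use PyFlink, MySQL, or MongoDB APIs.

```python
import hashlib
import math
from typing import Dict, Iterable, Iterator, List

# Hypothetical size; real embedding models produce far larger vectors.
EMBEDDING_DIM = 8


def toy_embedding(text: str, dim: int = EMBEDDING_DIM) -> List[float]:
    """Stand-in for a real embedding model (assumption, not the talk's method).

    Hashes each whitespace-separated token into one of `dim` buckets,
    counts hits per bucket, then L2-normalizes the result.
    """
    vector = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vector[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vector))
    # Empty text yields an all-zero vector; avoid dividing by zero.
    return [v / norm for v in vector] if norm > 0 else vector


def to_vector_document(record: Dict[str, object]) -> Dict[str, object]:
    """Map one source row (hypothetical schema: id, text) to a document with its embedding."""
    text = str(record["text"])
    return {"_id": record["id"], "text": text, "embedding": toy_embedding(text)}


def transform(records: Iterable[Dict[str, object]]) -> Iterator[Dict[str, object]]:
    """Handle records one at a time, the way a streaming map operator would."""
    for record in records:
        yield to_vector_document(record)


if __name__ == "__main__":
    rows = [
        {"id": 1, "text": "Flink streams data"},
        {"id": 2, "text": "Vectors power search"},
    ]
    for doc in transform(rows):
        print(doc["_id"], len(doc["embedding"]))
```

In a streaming job, a function shaped like `to_vector_document` would typically sit in a map step between the database source and the vector store sink, with the toy function replaced by a call to an embedding model.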

Hans-Peter Grahsl

October 23, 2024

Transcript

  1. Real-time ETL Powered by Apache Flink® and Debezium

    Platform diagram labels: Resource Management, Job Control, Connector Library, Platform and Job Observability, State Management, Job Tuning and Optimization, Security Engineering, Data Catalog, CLI, UI, dbt, Unified API, Flink and Debezium Runtime, Cloud Infrastructure, Java, SQL, Python.
    Legend: Decodable Provided, Customer Provided.
    Callout: Our focus today!