Slide 1

Slide 1 text

© 2023 Cloudera, Inc. All rights reserved. Adding Generative AI to Real-Time Streaming Pipelines Tim Spann Principal Developer Advocate April 2024

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

Tim Spann
Twitter: @PaasDev // Blog: datainmotion.dev
Principal Developer Advocate. Field Engineer. Princeton/NYC Future of Data Meetups.
ex-Pivotal, ex-Hortonworks, ex-StreamNative, ex-PwC
https://medium.com/@tspann
https://github.com/tspannhw

Slide 4

Slide 4 text

FLaNK Stack Weekly by Tim Spann
This week in Apache NiFi, Apache Flink, Apache Kafka, ML, AI, Apache Spark, Apache Iceberg, Python, Java, LLM, GenAI, Vector DB and Open Source friends.
https://bit.ly/32dAJft
https://www.meetup.com/futureofdata-princeton/

Slide 5

Slide 5 text

Future of Data - NYC + NJ + Philly + Virtual
@PaasDev
https://www.meetup.com/futureofdata-princeton/
From Big Data to AI to Streaming to Containers to Cloud to Analytics to Cloud Storage to Fast Data to Machine Learning to Microservices to ...

Slide 6

Slide 6 text

RAPID INNOVATION IN THE LLM SPACE
Too much to cover today, but you should know the common LLMs, frameworks, and tools.
Notable LLMs
● Closed Models: GPT3.5, GPT4, Claude2
● Open Models: Llama2, Mistral7B, Mixtral8x7B, Code Llama
● ++ 100s more… check out the Hugging Face LLM Leaderboard (pretrained, domain fine-tuned, chat models, …)
Popular LLM Frameworks
LangChain is a framework for developing apps powered by LLMs
● Python and JavaScript Libraries
● Provides modules for LLM Interface, Retrieval, & Agents
LlamaIndex is a framework designed specifically for RAG apps
● Python and JavaScript Libraries
● Provides built-in optimizations / techniques for advanced RAG
Hugging Face is an ML community for hosting & collaborating on models, datasets, and ML applications
● Latest open source LLMs are on Hugging Face
● + great learning resources / demos
● https://huggingface.co/
When to use one over the other?
● Use LangChain if you need a general-purpose framework with flexibility and extensibility.
● Consider LlamaIndex if you’re building a RAG-only app (retrieval/search).
Some common Vector DBs
Open Community & Open Models
Open Source vs Self Hosted vs SaaS option

Slide 7

Slide 7 text

ENTERPRISE WIDE USE CASES FOR AN LLM
Enterprise Knowledge Base / Chatbot / Q&A
- Customer Support & Troubleshooting
- Enable open ended conversations with user provided prompts
Code assistant
- Provide relevant snippets of code as a response to a request written in natural language.
- Assist with creating test cases and synthetic test data.
- Reference other relevant data such as a company’s documentation to help provide more accurate responses.
Social and emotional sensing
- Gauge emotions and opinions based on a piece of text.
- Understand and deliver a more nuanced message back based on sentiment.
Classification and Clustering
- Categorize and sort large volumes of data into common themes and trends to support more informed decision making.
Language Translation
- Globalize your content by feeding web pages through LLMs for translation.
- Combine with chatbots to provide multilingual support to your customer base.
Document Summarization
- Distill large amounts of text down to the most relevant points.
Content Generation
- Provide detailed and contextually relevant prompts to develop outlines, brainstorm ideas and approaches for content.
Adoption dependent upon an Enterprise’s risk tolerance, restrictions, decision rights and disclosure obligations.

Slide 8

Slide 8 text

Which Model and When?
Use the right model for the right job: closed or open-source
Closed Source
Usage can easily scale but so can your costs
Rapidly improving AI models
Most advanced AI models
Excel at more specialized tasks
Great for a wide range of tasks
Open Source
Better cost planning
Compliance, privacy, and security risks
More control over where & how models are deployed

Slide 9

Slide 9 text

ECOSYSTEM PARTNERSHIPS
Best of breed capabilities for best in class Enterprise AI
RAY COMPUTE
● Tune, manage, scale AI models and applications
● Integrated into CML Sessions
FOUNDATION
● Widest range of Foundation Models
● Serverless integration with CDP for fast time to value
PERFORMANCE
● Optimized GPU performance & accelerated data science pipelines
SEARCH
● Cloud-based semantic search made easy and at scale
● Store and manage AI representations of data in the public cloud
TOOLING
● Access to open source innovation through CML AMPs
● Embedded into CML (Model Registry & Serving)

Slide 10

Slide 10 text

Cloudera Generative AI Stack
● APPLICATIONS: Applied Machine Learning Prototypes (AMPs)
● CLOSED-SOURCE FOUNDATION MODELS (APIs): OpenAI (GPT-4 Turbo); Amazon Bedrock: Anthropic (Claude 2), Cohere…
● MODEL HUBS: Hugging Face
● OPEN SOURCE FOUNDATION MODELS: Meta (Llama 2)
● FINE-TUNED MODELS
● PRIVATE VECTOR STORE: Milvus, Solr*
● MANAGED VECTOR STORE: Pinecone
● CLOUD INFRASTRUCTURE
● SPECIALIZED HARDWARE
● Open Data Lakehouse: DATA WRANGLING, REAL-TIME DATA INGEST & ROUTING, AI MODEL TRAINING & INFERENCE, AI MODEL TRAINING & SERVING, DATA STORE & VISUALIZATION

Slide 11

Slide 11 text

HYBRID CLOUD (architecture diagram labels)
● Sources: Live Q&A, Travel Advisories, Weather Reports, Documents, Social Media, Databases, Transactions, Public Data Feeds, S3 / Files, Logs, ATM Data, Live Chat, …
● Stages: INTERACT, COLLECT, STORE, ENRICH, REPORT
● Actions: Distribute, Collect, Report, Visualize, Automate, Predict
● AI BASED ENHANCEMENTS: VECTOR DATABASE, LLM, Machine Learning
● Components: Data Flow, Messaging Broker, SQL Stream Builder, Data Warehouse, Data Visualization
● Data labels: Input Sentences, Generated Text, Timestamp, Input Sentence, Timestamps, Enrichments, Aggregations, Real-time alerting

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

Cloudera + LLMs
● Knowledge Repository: Data Storage / Management
● Data Preparation: Data Engineering
● LLM Fine Tuning Process: Training Framework
● LLM Serving: Serving Framework
● Key: CPU Task, GPU Task
● Platform labels: CML, CDE, CDP, Vector DB, CDF
● Streaming Classification, Real-Time Model Deployment

Slide 14

Slide 14 text

NLP / AI / LLM
Generative AI

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

DataFlow Pipelines Can Help
● External Context Ingest: Ingesting, routing, cleaning, enriching, transforming, parsing, chunking and vectorizing structured, unstructured, semistructured, and binary data and documents
● Prompt Engineering: Crafting and structuring queries to optimize LLM responses
● Context Retrieval: Enhancing the LLM with external context, as in Retrieval Augmented Generation (RAG)
● Roundtrip Interface: Acting as a Discord, REST, Kafka, SQL, or Slack bot to roundtrip discussions
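A minimal sketch of two of the steps named on this slide, chunking and context-augmented prompting, in plain Python. The function names, the character-based chunk size, and the prompt wording are assumptions made for illustration; they are not taken from the deck or from any NiFi processor.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks that overlap their neighbours."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Place retrieved chunks ahead of the question (the 'augmented' part of RAG)."""
    context = "\n---\n".join(context_chunks)
    return f"Use only this context to answer.\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    pieces = chunk_text("NiFi routes, cleans and enriches data before it reaches the LLM.", 30, 10)
    print(build_prompt("What does NiFi do?", pieces[:2]))
```

Overlapping chunks are one common way to avoid cutting a sentence's context at a chunk boundary; a real pipeline would typically chunk by tokens or document structure instead of raw characters.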

Slide 17

Slide 17 text

UNSTRUCTURED DATA WITH NIFI
• Archives - tar, gzipped, zipped, …
• Images - PNG, JPG, GIF, BMP, …
• Documents - HTML, Markdown, RSS, PDF, Doc, RTF, Plain Text, …
• Videos - MP4, Clips, Mov, Youtube URL…
• Sound - MP3, …
• Social / Chat - Slack, Discord, Twitter, REST, Email, …
• Identify Mime Types, Chunk Documents, Store to Vector Database
• Parse Documents - HTML, Markdown, PDF, Word, Excel, Powerpoint
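The "Identify Mime Types" step can be illustrated with a small routing sketch using only Python's standard library. This is an assumption-laden stand-in: it guesses the type from the file name alone, whereas a flow would normally inspect content, and the route names are hypothetical labels mirroring the categories on the slide.

```python
import mimetypes

# Hypothetical route names, keyed by the top-level MIME type.
ROUTES = {"image": "images", "video": "videos", "audio": "sound", "text": "documents"}
# Application types treated as documents or archives in this sketch.
DOCUMENT_TYPES = {"application/pdf", "application/msword", "application/rtf"}
ARCHIVE_TYPES = {"application/zip", "application/x-tar", "application/gzip"}


def route_for(filename: str) -> str:
    """Pick a route for a file based on the MIME type guessed from its name."""
    mime_type, encoding = mimetypes.guess_type(filename)
    # A compression encoding (for example gzip on .tar.gz) marks an archive.
    if encoding is not None or mime_type in ARCHIVE_TYPES:
        return "archives"
    if mime_type is None:
        return "unknown"
    if mime_type in DOCUMENT_TYPES:
        return "documents"
    return ROUTES.get(mime_type.split("/")[0], "unknown")


if __name__ == "__main__":
    for name in ("photo.png", "report.pdf", "backup.tar.gz", "clip.mp4", "README"):
        print(name, "->", route_for(name))
```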

Slide 18

Slide 18 text

CLOUD ML/DL/AI/Vector Database Services
• Cloudera ML
• Amazon Polly, Translate, Textract, Transcribe, Bedrock, …
• Hugging Face
• IBM Watson X.AI
• Vector Stores Anywhere: Weaviate, Pinecone, Milvus, Chroma DB, SOLR, …

Slide 19

Slide 19 text

NiFi 2.0.0 Features
● Python Integration
● Parameters
● JDK 21+
● JSON Flow Serialization
● Rules Engine for Development Assistance
● Run Process Group as Stateless
● flow.json.gz
https://medium.com/cloudera-inc/getting-ready-for-apache-nifi-2-0-5a5e6a67f450
https://cwiki.apache.org/confluence/display/NIFI/NiFi+2.0+Release+Goals

Slide 20

Slide 20 text

FLINK SQL -> CLOUDERA MACHINE LEARNING MODELS

Slide 21

Slide 21 text

FLINK SQL -> NIFI -> HUGGING FACE GOOGLE GEMINI

Slide 22

Slide 22 text

SSB UDF JS/JAVA + GenAI = Real-Time GenAI SQL
https://medium.com/cloudera-inc/adding-generative-ai-results-to-sql-streams-513e1fd2a6af
SELECT CALLLLM(CAST(messagetext as STRING)) as generatedtext,
       messagerealname, messageusername, messagetext, messageusertz, messageid, threadts, ts
FROM flankslackmessages
WHERE messagetype = 'message'
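What the query does per row can be sketched in Python: keep only rows whose messagetype is 'message', pass messagetext to a model call, and emit the reply as generatedtext alongside the selected columns. The slide's UDF is written in JS/Java; `call_llm_stub` below is a hypothetical stand-in, not the actual CALLLLM implementation, and no model endpoint is contacted.

```python
from typing import Callable, Iterable, Iterator

# Columns passed through unchanged, as listed in the SELECT clause.
COLUMNS = ("messagerealname", "messageusername", "messagetext",
           "messageusertz", "messageid", "threadts", "ts")


def call_llm_stub(prompt: str) -> str:
    """Hypothetical stand-in for the CALLLLM UDF; a real one would call a model."""
    return f"[generated reply to: {prompt}]"


def enrich_messages(rows: Iterable[dict],
                    call_llm: Callable[[str], str] = call_llm_stub) -> Iterator[dict]:
    """Mimic the query: filter on messagetype, then add generatedtext per row."""
    for row in rows:
        if row.get("messagetype") != "message":
            continue  # WHERE messagetype = 'message'
        generated = call_llm(str(row["messagetext"]))  # CAST(... as STRING)
        yield {"generatedtext": generated, **{c: row.get(c) for c in COLUMNS}}


if __name__ == "__main__":
    sample = [
        {"messagetype": "message", "messagetext": "What is NiFi?", "ts": "1"},
        {"messagetype": "channel_join", "messagetext": "joined"},
    ]
    for enriched in enrich_messages(sample):
        print(enriched["generatedtext"])
```

Passing the model call in as a parameter keeps the row logic testable without a live model.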

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

https://medium.com/cloudera-inc/google-gemma-for-real-time-lightweight-open-llm-inference-88efe98e580f

Slide 25

Slide 25 text

Python Processors

Slide 26

Slide 26 text

Basics

Slide 27

Slide 27 text

Basics

Slide 28

Slide 28 text

Basics

Slide 29

Slide 29 text

Extract Text from Web VTT
● Python 3.10+
● Web VTT to Text
● Web Video Text Tracks Format Extractor
https://developer.mozilla.org/en-US/docs/Web/API/WebVTT_API
https://github.com/tspannhw/FLaNK-python-processors/blob/main/TranslateWebVTT.py
WEBVTT
1
00:00:06.066 --> 00:00:07.166
Now let's talk about
2
00:00:07.166 --> 00:00:12.033
data retrieval, views, and materialized views.
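A minimal sketch of the WebVTT-to-text idea in standard-library Python. It is an illustration written for this slide's sample, not the code in the linked TranslateWebVTT.py; the function name and the choice to join cue text with spaces are assumptions. It drops the WEBVTT header, cue identifiers, and timing lines, and it tolerates input where the blank lines between cues are missing.

```python
import re

# Timing line such as "00:00:06.066 --> 00:00:07.166" (hours are optional).
TIMING = re.compile(
    r"^(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(?:\d{2,}:)?\d{2}:\d{2}\.\d{3}"
)
# Inline cue tags such as <v Speaker> or <b>.
TAG = re.compile(r"<[^>]+>")


def webvtt_to_text(vtt: str) -> str:
    """Return only the spoken text of a WebVTT document, joined by spaces."""
    lines = [line.strip() for line in vtt.splitlines()]
    text_lines = []
    in_cue = False
    for i, line in enumerate(lines):
        if not line:
            in_cue = False  # a blank line ends the current cue
            continue
        if TIMING.match(line):
            in_cue = True  # cue text follows the timing line
            continue
        next_line = lines[i + 1] if i + 1 < len(lines) else ""
        if TIMING.match(next_line):
            in_cue = False  # this line is a cue identifier, not text
            continue
        if in_cue:
            text_lines.append(TAG.sub("", line))
    return " ".join(text_lines)


if __name__ == "__main__":
    sample = (
        "WEBVTT\n\n1\n00:00:06.066 --> 00:00:07.166\nNow let's talk about\n\n"
        "2\n00:00:07.166 --> 00:00:12.033\ndata retrieval, views, and materialized views.\n"
    )
    print(webvtt_to_text(sample))
```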

Slide 30

Slide 30 text

WatsonX SDK To Foundation
● Python 3.10+
● LLM
● WatsonX.AI Foundation Models
● Inference
● Secure
● Official SDK from IBM
https://github.com/tspannhw/FLaNK-python-watsonx-processor

Slide 31

Slide 31 text

Generate Synthetic Records w/ Faker
● Python 3.10+
● faker
● Choose as many as you want
● Attribute output

Slide 32

Slide 32 text

Download a Wiki Page as HTML or WikiFormat (Text)
● Python 3.10+
● Wikipedia-api
● HTML or Text
● Choose your wiki page dynamically

Slide 33

Slide 33 text

Extract Company Names
● Python 3.10+
● Hugging Face, NLP, spaCy, PyTorch
https://github.com/tspannhw/FLaNK-python-ExtractCompanyName-processor

Slide 34

Slide 34 text

CaptionImage
● Python 3.10+
● Hugging Face
● Salesforce/blip-image-captioning-large
● Generate Captions for Images
● Adds captions to FlowFile Attributes
● Does not require download or copies of your images
https://github.com/tspannhw/FLaNK-python-processors

Slide 35

Slide 35 text

RESNetImageClassification
● Python 3.10+
● Hugging Face
● Transformers
● PyTorch
● Datasets
● microsoft/resnet-50
● Adds classification label to FlowFile Attributes
● Does not require download or copies of your images
https://github.com/tspannhw/FLaNK-python-processors

Slide 36

Slide 36 text

NSFWImageDetection
● Python 3.10+
● Hugging Face
● Transformers
● Falconsai/nsfw_image_detection
● Adds normal and nsfw to FlowFile Attributes
● Gives score on safety of image
● Does not require download or copies of your images
https://github.com/tspannhw/FLaNK-python-processors

Slide 37

Slide 37 text

FacialEmotionsImageDetection
● Python 3.10+
● Hugging Face
● Transformers
● facial_emotions_image_detection
● Image Classification
● Adds labels/scores to FlowFile Attributes
● Does not require download or copies of your images
https://github.com/tspannhw/FLaNK-python-processors

Slide 38

Slide 38 text

Other Python Processors
● Put/Query-Pinecone (Vector DB Interface)
● ChunkDocument, ParseDocument
● ConvertCSVtoExcel
● DetectObjectInImage
● PromptChatGPT
● Put/Query-Chroma (Vector DB Interface)
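To show what a "Put/Query" vector DB interface does conceptually, here is a toy in-memory store ranked by cosine similarity. `TinyVectorStore` and its methods are hypothetical names for illustration only; this is not the Pinecone or Chroma API, and the vectors are hand-written instead of coming from an embedding model.

```python
import math


class TinyVectorStore:
    """Toy in-memory vector store: put vectors, query by cosine similarity."""

    def __init__(self) -> None:
        self._items: dict[str, tuple[list[float], str]] = {}

    def put(self, item_id: str, vector: list[float], text: str) -> None:
        """Store or overwrite one vector with its source text."""
        self._items[item_id] = (list(vector), text)

    @staticmethod
    def _cosine(a: list[float], b: list[float]) -> float:
        if len(a) != len(b):
            raise ValueError("vectors must have the same dimension")
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        if norm == 0.0:
            return 0.0  # treat a zero vector as matching nothing
        return sum(x * y for x, y in zip(a, b)) / norm

    def query(self, vector: list[float], top_k: int = 3) -> list[tuple[float, str, str]]:
        """Return up to top_k (score, id, text) tuples, best match first."""
        scored = [(self._cosine(vector, stored), item_id, text)
                  for item_id, (stored, text) in self._items.items()]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]


if __name__ == "__main__":
    store = TinyVectorStore()
    store.put("a", [1.0, 0.0], "alpha chunk")
    store.put("b", [0.0, 1.0], "beta chunk")
    for score, item_id, text in store.query([1.0, 0.1], top_k=1):
        print(item_id, round(score, 3), text)
```

A managed or self-hosted vector database replaces the linear scan with an approximate nearest-neighbour index, but the put-then-query contract is the same shape.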

Slide 39

Slide 39 text

DEMO

Slide 40

Slide 40 text

Philadelphia SEPTA
https://medium.com/@tspann/septa-transit-real-time-81082878b485

Slide 41

Slide 41 text

Street Cameras
https://medium.com/cloudera-inc/streaming-street-cams-to-yolo-v8-with-python-and-nifi-to-minio-s3-3277e73723ce

Slide 42

Slide 42 text

NYC Subway
https://medium.com/cloudera-inc/subways-and-transit-updates-in-real-time-30c104c359ef

Slide 43

Slide 43 text

MORE ARTICLES
● https://medium.com/cloudera-inc/watching-airport-traffic-in-real-time-32c522a6e386
● https://medium.com/cloudera-inc/building-a-real-time-data-pipeline-a-comprehensive-tutorial-on-minifi-nifi-kafka-and-flink-ee03ee6722cb
● https://medium.com/cloudera-inc/finding-the-best-way-around-7491c76ca4cb
● https://medium.com/cloudera-inc/nyc-traffic-are-you-kidding-me-6d3fa853903b
● https://medium.com/@tspann/building-a-travel-advisory-app-with-apache-nifi-in-k8-969b44c84958
● https://medium.com/@tspann/using-ollama-with-mistral-and-apache-nifi-720c17f5ff12
● https://medium.com/cloudera-inc/google-gemma-for-real-time-lightweight-open-llm-inference-88efe98e580f
● https://medium.com/@tspann/image-processing-with-custom-python-and-nifi-2-0-06eadc62c03c
● https://medium.com/@tspann/ai-augmented-devrel-part-1-4058af905a89
● https://medium.com/cloudera-inc/mixtral-generative-sparse-mixture-of-experts-in-dataflows-59744f7d28a9
● https://medium.com/@tspann/building-an-llm-bot-for-meetups-and-conference-interactivity-c211ea6e3b61
● https://medium.com/@tspann/yet-another-python-processor-45aaae6fe406

Slide 44

Slide 44 text

LLM 2024 Startup Grind AI Max Summit - April 12 NJ

Slide 45

Slide 45 text

THANK YOU