Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines

Timothy Spann

April 15, 2024
Transcript

  1. Adding Generative AI to Real-Time Streaming Pipelines

    Tim Spann, Principal Developer Advocate. April 2024.
    © 2023 Cloudera, Inc. All rights reserved.
  2. Tim Spann

    Principal Developer Advocate. Field Engineer. Princeton/NYC Future of Data Meetups. ex-Pivotal, ex-Hortonworks, ex-StreamNative, ex-PwC.
    Twitter: @PaasDev // Blog: datainmotion.dev
    https://medium.com/@tspann
    https://github.com/tspannhw
  3. FLaNK Stack Weekly by Tim Spann

    This week in Apache NiFi, Apache Flink, Apache Kafka, ML, AI, Apache Spark, Apache Iceberg, Python, Java, LLM, GenAI, Vector DB and Open Source friends.
    https://bit.ly/32dAJft
    https://www.meetup.com/futureofdata-princeton/
  4. Future of Data - NYC + NJ + Philly + Virtual

    @PaasDev
    https://www.meetup.com/futureofdata-princeton/
    From Big Data to AI to Streaming to Containers to Cloud to Analytics to Cloud Storage to Fast Data to Machine Learning to Microservices to ...
  5. RAPID INNOVATION IN THE LLM SPACE

    Too much to cover today, but you should know the common LLMs, frameworks and tools.

    Notable LLMs
    • Closed models: GPT3.5, GPT4, Claude2
    • Open models: Llama2, Code Llama, Mistral7B, Mixtral8x7B
    • ++ 100s more… check out the HuggingFace LLM Leaderboard (pretrained, domain fine-tuned, chat models, …)

    Popular LLM Frameworks
    • Langchain is a framework for developing apps powered by LLMs
      - Python and JavaScript libraries
      - Provides modules for LLM Interface, Retrieval, & Agents
    • LlamaIndex is a framework designed specifically for RAG apps
      - Python and JavaScript libraries
      - Provides built-in optimizations / techniques for advanced RAG
    • When to use one over the other? Use Langchain if you need a general-purpose framework with flexibility and extensibility. Consider LlamaIndex if you’re building a RAG-only app (retrieval/search).

    Open Community & Open Models
    • HuggingFace is an ML community for hosting & collaborating on models, datasets, and ML applications
      - Latest open source LLMs are in HuggingFace
      - + great learning resources / demos
      - https://huggingface.co/

    Some common Vector DBs
    • Open Source vs Self Hosted vs SaaS option
  6. ENTERPRISE WIDE USE CASES FOR AN LLM

    Enterprise Knowledge Base / Chatbot / Q&A
    - Customer Support & Troubleshooting
    - Enable open ended conversations with user provided prompts
    Code assistant
    - Provide relevant snippets of code as a response to a request written in natural language.
    - Assist with creating test cases and synthetic test data.
    - Reference other relevant data such as a company’s documentation to help provide more accurate responses.
    Social and emotional sensing
    - Gauge emotions and opinions based on a piece of text.
    - Understand and deliver a more nuanced message back based on sentiment.
    Classification and Clustering
    - Categorize and sort large volumes of data into common themes and trends to support more informed decision making.
    Language Translation
    - Globalize your content by feeding web pages through LLMs for translation.
    - Combine with chatbots to provide multilingual support to your customer base.
    Document Summarization
    - Distill large amounts of text down to the most relevant points.
    Content Generation
    - Provide detailed and contextually relevant prompts to develop outlines, brainstorm ideas and approaches for content.

    Adoption depends on an enterprise’s risk tolerance, restrictions, decision rights and disclosure obligations.
  7. Which Model and When?

    Use the right model for the right job: closed or open-source.
    Closed Source
    • Usage can easily scale but so can your costs
    • Rapidly improving AI models
    • Most advanced AI models
    • Excel at more specialized tasks
    • Great for a wide range of tasks
    Open Source
    • Better cost planning
    • Compliance, privacy, and security risks
    • More control over where & how models are deployed
  8. ECOSYSTEM PARTNERSHIPS

    Best of breed capabilities for best in class Enterprise AI
    RAY COMPUTE
    • Tune, manage, scale AI models and applications
    • Integrated into CML Sessions
    FOUNDATION
    • Widest range of Foundation Models
    • Serverless integration with CDP for fast time to value
    PERFORMANCE
    • Optimized GPU performance & accelerated data science pipelines
    SEARCH
    • Cloud-based semantic search made easy and at scale
    • Store and manage AI representations of data in the public cloud
    TOOLING
    • Access to open source innovation through CML AMPs
    • Embedded into CML (Model Registry & Serving)
  9. Cloudera Generative AI Stack

    [Diagram; labels only]
    Stack layers: APPLICATIONS; CLOSED-SOURCE FOUNDATION MODELS; MODEL HUBS; OPEN SOURCE FOUNDATION MODELS; FINE-TUNED MODELS; PRIVATE VECTOR STORE; MANAGED VECTOR STORE; CLOUD INFRASTRUCTURE; SPECIALIZED HARDWARE.
    Named components: Applied Machine Learning Prototypes (AMPs); APIs: OpenAI (GPT-4 Turbo); Amazon Bedrock: Anthropic (Claude 2), Cohere…; Hugging Face; Meta (Llama 2); Milvus, Solr*; Pinecone.
    Platform row: DATA WRANGLING; REAL-TIME DATA INGEST & ROUTING; AI MODEL TRAINING & INFERENCE; AI MODEL TRAINING & SERVING; DATA STORE & VISUALIZATION; Open Data Lakehouse.
  10. [Architecture diagram; labels only]

    Sources: Live Q&A, Travel Advisories, Weather Reports, Documents, Social Media, Databases, Transactions, Public Data Feeds, S3 / Files, Logs, ATM Data, Live Chat, …
    Stages (HYBRID CLOUD): INTERACT; COLLECT; STORE; ENRICH, REPORT; Distribute; Collect; Report; Visualize; Report, Automate; AI BASED ENHANCEMENTS; Predict, Automate.
    Components: VECTOR DATABASE, LLM, Machine Learning, Data Visualization, Data Flow, Data Warehouse, SQL Stream Builder, Messaging Broker.
    Data labels: Input Sentences, Generated Text, Timestamp, Input Sentence, Timestamps, Enrichments, Real-time alerting, Aggregations.
  11. Cloudera + LLMs

    [Diagram; labels only]
    Stages: Knowledge Repository; Data Storage / Management; Data Preparation; Data Engineering; LLM Fine Tuning Process; Training Framework; LLM Serving; Serving Framework.
    Key: CPU Task; GPU Task.
    Platform labels: CML; CDE; CDP; Vector DB; CDF; Streaming; Classification; Real-Time Model Deployment.
  12. DataFlow Pipelines Can Help

    • External Context Ingest: ingesting, routing, cleaning, enriching, transforming, parsing, chunking and vectorizing structured, unstructured, semi-structured and binary data and documents
    • Prompt engineering: crafting and structuring queries to optimize LLM responses
    • Context Retrieval: enhancing the LLM with external context, as in Retrieval Augmented Generation (RAG)
    • Roundtrip Interface: act as a Discord, REST, Kafka, SQL or Slack bot to roundtrip discussions
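The chunking step named under External Context Ingest can be sketched in plain Python. This is only an illustration under stated assumptions, not the NiFi ChunkDocument processor: the function name `chunk_text` and the `chunk_size` and `overlap` parameters (both counted in words) are invented for the example.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into overlapping word-based chunks (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = " ".join(str(i) for i in range(10))
    for chunk in chunk_text(sample, chunk_size=4, overlap=1):
        print(chunk)
```

Overlapping windows are one common choice because a sentence cut at a chunk boundary still appears whole in a neighbouring chunk; each chunk would then be embedded and written to a vector store.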
  13. UNSTRUCTURED DATA WITH NIFI

    • Archives - tar, gzipped, zipped, …
    • Images - PNG, JPG, GIF, BMP, …
    • Documents - HTML, Markdown, RSS, PDF, Doc, RTF, Plain Text, …
    • Videos - MP4, Clips, Mov, Youtube URL…
    • Sound - MP3, …
    • Social / Chat - Slack, Discord, Twitter, REST, Email, …
    • Identify Mime Types, Chunk Documents, Store to Vector Database
    • Parse Documents - HTML, Markdown, PDF, Word, Excel, Powerpoint
  14. CLOUD ML/DL/AI/Vector Database Services

    • Cloudera ML
    • Amazon Polly, Translate, Textract, Transcribe, Bedrock, …
    • Hugging Face
    • IBM Watson X.AI
    • Vector Stores Anywhere: Weaviate, Pinecone, Milvus, Chroma DB, SOLR, …
  15. NiFi 2.0.0 Features

    • Python Integration
    • Parameters
    • JDK 21+
    • JSON Flow Serialization
    • Rules Engine for Development Assistance
    • Run Process Group as Stateless
    • flow.json.gz
    https://medium.com/cloudera-inc/getting-ready-for-apache-nifi-2-0-5a5e6a67f450
    https://cwiki.apache.org/confluence/display/NIFI/NiFi+2.0+Release+Goals
  16. FLINK SQL -> CLOUDERA MACHINE LEARNING MODELS
  17. FLINK SQL -> NIFI -> HUGGING FACE GOOGLE GEMINI
  18. SSB UDF JS/JAVA + GenAI = Real-Time GenAI SQL

    https://medium.com/cloudera-inc/adding-generative-ai-results-to-sql-streams-513e1fd2a6af

    SELECT CALLLLM(CAST(messagetext as STRING)) as generatedtext,
           messagerealname, messageusername, messagetext, messageusertz,
           messageid, threadts, ts
    FROM flankslackmessages
    WHERE messagetype = 'message'
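The query above applies a user-defined function to every Slack message row. The SSB UDF itself is written in JS or Java; the Python sketch below only mirrors the row-level logic (filter, call the model, project columns) so the data flow is easier to follow. `fake_llm` and `enrich_messages` are hypothetical names, and `fake_llm` is a stand-in for a real model call, not the author's CALLLLM implementation.

```python
from typing import Callable, Dict, Iterable, Iterator

# Columns projected by the SELECT, in the same order.
COLUMNS = ("messagerealname", "messageusername", "messagetext",
           "messageusertz", "messageid", "threadts", "ts")


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption): reports the prompt length."""
    return f"[generated reply to {len(prompt)} chars]"


def enrich_messages(rows: Iterable[Dict[str, str]],
                    call_llm: Callable[[str], str] = fake_llm
                    ) -> Iterator[Dict[str, str]]:
    """Mimic SELECT CALLLLM(messagetext), ... WHERE messagetype = 'message'."""
    for row in rows:
        if row.get("messagetype") != "message":
            continue  # same filter as the WHERE clause
        out = {"generatedtext": call_llm(str(row.get("messagetext", "")))}
        for column in COLUMNS:
            out[column] = row.get(column, "")
        yield out


if __name__ == "__main__":
    stream = [
        {"messagetype": "message", "messagetext": "hello", "ts": "1"},
        {"messagetype": "bot_update", "messagetext": "ignored", "ts": "2"},
    ]
    for enriched in enrich_messages(stream):
        print(enriched["ts"], enriched["generatedtext"])
```

Passing the model call in as a parameter keeps the filter and projection testable without network access.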
  19. Extract Text from Web VTT

    • Python 3.10+
    • Web VTT to Text
    • Web Video Text Tracks Format Extractor
    https://developer.mozilla.org/en-US/docs/Web/API/WebVTT_API
    https://github.com/tspannhw/FLaNK-python-processors/blob/main/TranslateWebVTT.py

    WEBVTT

    1
    00:00:06.066 --> 00:00:07.166
    Now let's talk about

    2
    00:00:07.166 --> 00:00:12.033
    data retrieval, views, and materialized views.
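A WebVTT-to-text conversion like the one this slide describes can be sketched with the standard library alone. This is a simplified illustration, not the code in TranslateWebVTT.py: the function name `webvtt_to_text` is an assumption, and the sketch keeps only the lines that follow a cue timing line, so the header, cue identifiers and NOTE blocks are dropped.

```python
import re
from typing import List

# Cue timing line, e.g. "00:00:06.066 --> 00:00:07.166".
# The hours component is optional in WebVTT timestamps.
TIMING = re.compile(
    r"^(\d{2,}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+(\d{2,}:)?\d{2}:\d{2}\.\d{3}"
)


def webvtt_to_text(vtt: str) -> str:
    """Return only the caption text from a WebVTT document (simplified)."""
    captions: List[str] = []
    normalized = vtt.replace("\r\n", "\n").strip()
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        block_lines = block.split("\n")
        for index, line in enumerate(block_lines):
            if TIMING.match(line.strip()):
                # Everything after the timing line is caption payload.
                captions.extend(
                    payload.strip()
                    for payload in block_lines[index + 1:]
                    if payload.strip()
                )
                break
    return " ".join(captions)


if __name__ == "__main__":
    sample = (
        "WEBVTT\n\n"
        "1\n00:00:06.066 --> 00:00:07.166\nNow let's talk about\n\n"
        "2\n00:00:07.166 --> 00:00:12.033\n"
        "data retrieval, views, and materialized views.\n"
    )
    print(webvtt_to_text(sample))
```

Inline cue markup such as `<v Speaker>` tags is left untouched here; a fuller extractor would strip it as well.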
  20. WatsonX SDK To Foundation

    • Python 3.10+
    • LLM
    • WatsonX.AI Foundation Models
    • Inference
    • Secure
    • Official SDK from IBM
    https://github.com/tspannhw/FLaNK-python-watsonx-processor
  21. Generate Synthetic Records w/ Faker

    • Python 3.10+
    • faker
    • Choose as many as you want
    • Attribute output
  22. Download a Wiki Page as HTML or WikiFormat (Text)

    • Python 3.10+
    • Wikipedia-api
    • HTML or Text
    • Choose your wiki page dynamically
  23. Extract Company Names

    • Python 3.10+
    • Hugging Face, NLP, spaCy, PyTorch
    https://github.com/tspannhw/FLaNK-python-ExtractCompanyName-processor
  24. CaptionImage

    • Python 3.10+
    • Hugging Face
    • Salesforce/blip-image-captioning-large
    • Generate Captions for Images
    • Adds captions to FlowFile Attributes
    • Does not require download or copies of your images
    https://github.com/tspannhw/FLaNK-python-processors
  25. RESNetImageClassification

    • Python 3.10+
    • Hugging Face
    • Transformers
    • PyTorch
    • Datasets
    • microsoft/resnet-50
    • Adds classification label to FlowFile Attributes
    • Does not require download or copies of your images
    https://github.com/tspannhw/FLaNK-python-processors
  26. NSFWImageDetection

    • Python 3.10+
    • Hugging Face
    • Transformers
    • Falconsai/nsfw_image_detection
    • Adds normal and nsfw to FlowFile Attributes
    • Gives score on safety of image
    • Does not require download or copies of your images
    https://github.com/tspannhw/FLaNK-python-processors
  27. FacialEmotionsImageDetection

    • Python 3.10+
    • Hugging Face
    • Transformers
    • facial_emotions_image_detection
    • Image Classification
    • Adds labels/scores to FlowFile Attributes
    • Does not require download or copies of your images
    https://github.com/tspannhw/FLaNK-python-processors
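The image processors on slides 24 to 27 all attach model output to FlowFile attributes. A Hugging Face image-classification pipeline returns a list of `{"label": ..., "score": ...}` dictionaries; the sketch below shows one way such a list could be flattened into string attributes. It does not load a model or call NiFi, and the function name `predictions_to_attributes`, the `prefix` parameter and the key naming are assumptions for illustration, not the attribute names the author's processors emit.

```python
from typing import Dict, List, Union

Prediction = Dict[str, Union[str, float]]


def predictions_to_attributes(predictions: List[Prediction],
                              prefix: str = "") -> Dict[str, str]:
    """Flatten [{'label': ..., 'score': ...}] into string attributes.

    Attribute values are strings, so each score is formatted with
    four decimal places. Key naming here is hypothetical.
    """
    attributes: Dict[str, str] = {}
    for item in predictions:
        label = str(item["label"]).strip().lower().replace(" ", "_")
        attributes[f"{prefix}{label}"] = f"{float(item['score']):.4f}"
    return attributes


if __name__ == "__main__":
    # Hand-written example output in the pipeline's result shape.
    example = [
        {"label": "normal", "score": 0.9871},
        {"label": "nsfw", "score": 0.0129},
    ]
    print(predictions_to_attributes(example))
```

Keeping scores as attributes rather than rewriting the FlowFile content matches the slides' point that the image itself does not need to be copied or downloaded again downstream.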
  28. Other Python Processors

    • Put/Query-Pinecone (Vector DB Interface)
    • ChunkDocument, ParseDocument
    • ConvertCSVtoExcel
    • DetectObjectInImage
    • PromptChatGPT
    • Put/Query-Chroma (Vector DB Interface)
  29. MORE ARTICLES

    • https://medium.com/cloudera-inc/watching-airport-traffic-in-real-time-32c522a6e386
    • https://medium.com/cloudera-inc/building-a-real-time-data-pipeline-a-comprehensive-tutorial-on-minifi-nifi-kafka-and-flink-ee03ee6722cb
    • https://medium.com/cloudera-inc/finding-the-best-way-around-7491c76ca4cb
    • https://medium.com/cloudera-inc/nyc-traffic-are-you-kidding-me-6d3fa853903b
    • https://medium.com/@tspann/building-a-travel-advisory-app-with-apache-nifi-in-k8-969b44c84958
    • https://medium.com/@tspann/using-ollama-with-mistral-and-apache-nifi-720c17f5ff12
    • https://medium.com/cloudera-inc/google-gemma-for-real-time-lightweight-open-llm-inference-88efe98e580f
    • https://medium.com/@tspann/image-processing-with-custom-python-and-nifi-2-0-06eadc62c03c
    • https://medium.com/@tspann/ai-augmented-devrel-part-1-4058af905a89
    • https://medium.com/cloudera-inc/mixtral-generative-sparse-mixture-of-experts-in-dataflows-59744f7d28a9
    • https://medium.com/@tspann/building-an-llm-bot-for-meetups-and-conference-interactivity-c211ea6e3b61
    • https://medium.com/@tspann/yet-another-python-processor-45aaae6fe406