Slide 1

Slide 1 text

DATA & AI BOOTCAMP 2024

Slide 2

Slide 2 text

DATA & AI BOOTCAMP 2024 Add LINE Group

Slide 3

Slide 3 text

DATA & AI BOOTCAMP 2024 JOIN WIFI
WIFI Name: Dee Dar Bar Guese
"Just connect, it's free, no password"

Slide 4

Slide 4 text

DATA & AI BOOTCAMP 2024 Course Outline: Data Foundation
● Modern Data Stack
○ Docker for managing environments
○ Apache Airflow for orchestrating data pipelines
● Database System
○ MongoDB (NoSQL) for semi-structured data
○ PostgreSQL for relational data
● Google Cloud Platform
○ Cloud Storage as the Data Lake
○ BigQuery as the Data Warehouse, ready for analysis

Slide 5

Slide 5 text

DATA & AI BOOTCAMP 2024 Interactive Session / Active Listener
Raise your hand to ask questions, respond, and learn together.
TODAY PRIZE🥇
WE ARE HERE TO SUPPORT
FINISH TOGETHER 🚀

Slide 6

Slide 6 text

DATA & AI BOOTCAMP 2024 Git Repository + VS Code
https://github.com/wuttichai-hung/data-ai-bootcamp
** Note: we recommend using GitHub Codespaces for all labs. If you use a MacBook, or already have VS Code and Docker installed on your machine, you can work in your local environment instead.
1. Click Fork on the repository

Slide 7

Slide 7 text

DATA & AI BOOTCAMP 2024 Git Repository + VS Code
2. Click Create fork to copy the bootcamp code into your own GitHub account

Slide 8

Slide 8 text

DATA & AI BOOTCAMP 2024 Git Repository + VS Code
3. The Sync fork button checks whether the code in your fork is the latest version and matches the instructors' repository. If it is up to date, a check mark is shown.

Slide 9

Slide 9 text

DATA & AI BOOTCAMP 2024 Git Repository + VS Code
If the instructors update the code, you can click Update branch to get the latest version.

Slide 10

Slide 10 text

DATA & AI BOOTCAMP 2024 Start in Codespace 4. กด code > Open in codespace

Slide 11

Slide 11 text

DATA & AI BOOTCAMP 2024 Start in Codespace Welcome to Codespace!! Now You’re Ready to Rock the Bootcamp

Slide 12

Slide 12 text

DATA & AI BOOTCAMP 2024 Understanding Basic Data Foundation

Slide 13

Slide 13 text

DATA & AI BOOTCAMP 2024 Why data is important to Business
GDP = C + I + G + (X - M)
● C = Consumption: consumption by companies and the general public
● I = Investment: private-sector investment in activities across the economy
● G = Government Spending: government expenditure and public-sector investment
● X - M = Exports - Imports: exports minus imports, which is needed to see the true final level of consumption
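
The formula above can be worked through in a few lines of Python. This is only an illustrative sketch: the function name `gdp` and the figures are hypothetical, chosen to make the arithmetic easy to follow, and are not real economic data.

```python
def gdp(consumption: float, investment: float, government: float,
        exports: float, imports: float) -> float:
    """Expenditure-approach GDP: C + I + G + (X - M)."""
    net_exports = exports - imports  # the (X - M) term
    return consumption + investment + government + net_exports


# Hypothetical figures (say, in billions); not real data.
example = gdp(consumption=500, investment=200, government=150,
              exports=120, imports=100)
print(example)
```

Each input is itself a data point someone has to collect, which is the link between the formula and the slide's question.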

Slide 14

Slide 14 text

DATA & AI BOOTCAMP 2024 Why data is important to Business

Slide 15

Slide 15 text

DATA & AI BOOTCAMP 2024 Why data is important to Business https://www.deloitte.com/content/dam/assets-shared/legacy/docs/analysis/2022/dttl-analytics-analytics-advantage-report.pdf

Slide 16

Slide 16 text

DATA & AI BOOTCAMP 2024 Who oversees analytics initiatives? https://www.deloitte.com/content/dam/assets-shared/legacy/docs/analysis/2022/dttl-analytics-analytics-advantage-report.pdf

Slide 17

Slide 17 text

DATA & AI BOOTCAMP 2024 The 4 levels of data maturity: how data-mature is your business? https://www.edq.com/blog/data-maturity-how-mature-are-you/

Slide 18

Slide 18 text

DATA & AI BOOTCAMP 2024 Main Types of Data Sources

Slide 19

Slide 19 text

DATA & AI BOOTCAMP 2024 Main Types of Data Sources

Slide 20

Slide 20 text

DATA & AI BOOTCAMP 2024 Data Classification

Slide 21

Slide 21 text

DATA & AI BOOTCAMP 2024 Roles & Responsibilities in Data Careers
● Software Engineer
● System Engineer
● Data Engineer
● Data Analyst
● Data Scientist
● Analytics Engineer

Slide 22

Slide 22 text

DATA & AI BOOTCAMP 2024 Extract Transform Load (ETL)

Slide 23

Slide 23 text

DATA & AI BOOTCAMP 2024 Basic Python & SQL

Slide 24

Slide 24 text

DATA & AI BOOTCAMP 2024 Docker (Containerize an application)

Slide 25

Slide 25 text

DATA & AI BOOTCAMP 2024 Why Docker?
● Isolation
● Lightweight
● Simplicity
● Workflow
● Community

Slide 26

Slide 26 text

DATA & AI BOOTCAMP 2024 Why Docker?

Slide 27

Slide 27 text

DATA & AI BOOTCAMP 2024 Docker vs. VM
Use Docker if:
● You need lightweight, scalable solutions.
● You are working with microservices or cloud-native applications.
● Consistency across environments is crucial.
Use Virtual Machines if:
● You need to run multiple OS environments.
● Applications require complete isolation.
● Legacy applications are involved that demand dedicated OS resources.
https://k21academy.com/docker-kubernetes/docker-vs-virtual-machine/

Slide 28

Slide 28 text

DATA & AI BOOTCAMP 2024 Postgres

Slide 29

Slide 29 text

DATA & AI BOOTCAMP 2024 Mongodb

Slide 30

Slide 30 text

DATA & AI BOOTCAMP 2024 Quiz Time!!

Slide 31

Slide 31 text

DATA & AI BOOTCAMP 2024 Apache Airflow

Slide 32

Slide 32 text

DATA & AI BOOTCAMP 2024 What is Apache Airflow? Airflow is an open-source platform, originally created at Airbnb, for developing, scheduling, and monitoring batch-oriented workflows.

Slide 33

Slide 33 text

DATA & AI BOOTCAMP 2024 What is a Data Pipeline? A data pipeline is a means of moving data from one place (the source) to a destination (such as a data warehouse). Along the way, the data is transformed and optimized, arriving in a state that can be analyzed and used to develop business insights.
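
A minimal sketch of the source-to-destination idea described above, using only the Python standard library. The CSV content, the field names, and the in-memory `warehouse` list are hypothetical stand-ins; in the bootcamp labs the source and destination would be real systems such as a database or BigQuery.

```python
import csv
import io

# Hypothetical raw source data; the second row has a missing amount.
RAW_CSV = """order_id,amount,currency
1,100.50,thb
2,,thb
3,250.00,usd
"""


def extract(text: str) -> list[dict]:
    """Read raw rows from the source exactly as they are."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: list[dict]) -> list[dict]:
    """Drop incomplete rows and convert types so the data can be analyzed."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip rows without an amount
        cleaned.append({
            "order_id": int(row["order_id"]),
            "amount": float(row["amount"]),
            "currency": row["currency"].upper(),
        })
    return cleaned


def load(rows: list[dict], destination: list) -> int:
    """Append rows to the destination and report how many were written."""
    destination.extend(rows)
    return len(rows)


warehouse: list[dict] = []  # stand-in for a data warehouse table
loaded = load(transform(extract(RAW_CSV)), warehouse)
print(loaded, warehouse)
```

Keeping the three steps as separate functions mirrors how a pipeline is usually split into independently retryable tasks.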

Slide 34

Slide 34 text

DATA & AI BOOTCAMP 2024 Traditional Data Pipeline

Slide 35

Slide 35 text

DATA & AI BOOTCAMP 2024 Apache Airflow Features

Slide 36

Slide 36 text

DATA & AI BOOTCAMP 2024 Apache Airflow Features - Pure Python

Slide 37

Slide 37 text

DATA & AI BOOTCAMP 2024 Apache Airflow Features - Robust Integration

Slide 38

Slide 38 text

DATA & AI BOOTCAMP 2024 Apache Airflow Features - Useful UI

Slide 39

Slide 39 text

DATA & AI BOOTCAMP 2024 Apache Airflow Features - Open Source

Slide 40

Slide 40 text

DATA & AI BOOTCAMP 2024 Apache Airflow Features - Easy to Use

Slide 41

Slide 41 text

DATA & AI BOOTCAMP 2024 What is a DAG? D = Directed A = Acyclic G = Graph
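
To make "directed" and "acyclic" concrete, the sketch below uses Python's standard-library `graphlib` (Python 3.9+), not Airflow itself. The task names are hypothetical. A valid DAG yields an execution order; a graph with a cycle is rejected.

```python
from graphlib import CycleError, TopologicalSorter

# Each key maps a task to the set of tasks it depends on (its upstream tasks).
pipeline = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

# Directed edges give a well-defined order in which the tasks can run.
order = list(TopologicalSorter(pipeline).static_order())
print(order)

# A cycle (a needs b, b needs a) has no valid order, so it is not a DAG.
cyclic = {"a": {"b"}, "b": {"a"}}
try:
    list(TopologicalSorter(cyclic).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)
```

This is the same property a workflow scheduler relies on: without cycles, every task has a point at which all of its upstream tasks are done.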

Slide 42

Slide 42 text

DATA & AI BOOTCAMP 2024 Pipeline as DAG

Slide 43

Slide 43 text

DATA & AI BOOTCAMP 2024 Example Data pipeline

Slide 44

Slide 44 text

DATA & AI BOOTCAMP 2024 Apache Airflow Core Components

Slide 45

Slide 45 text

DATA & AI BOOTCAMP 2024 Apache Airflow Core Components

Slide 46

Slide 46 text

DATA & AI BOOTCAMP 2024 Airflow Concept to Code
● DAG - the graph representation of your data pipeline
● Operator - describes a single task in your data pipeline
● Task - an instance of an operator

Slide 47

Slide 47 text

DATA & AI BOOTCAMP 2024 Dependencies between Tasks In Airflow, dependencies between tasks are commonly defined with the right-shift operator (>>): a >> b means that b runs after a.
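
The sketch below shows how a `>>` syntax like this can work in plain Python through operator overloading. The `Task` class here is a toy stand-in written for illustration, not Airflow's real implementation, and the task names are hypothetical.

```python
class Task:
    """Toy stand-in for an Airflow task; not the real Airflow class."""

    def __init__(self, task_id: str):
        self.task_id = task_id
        self.downstream: list[str] = []

    def __rshift__(self, other):
        # `a >> b` records b as downstream of a; a list fans out to several tasks.
        targets = other if isinstance(other, list) else [other]
        for target in targets:
            self.downstream.append(target.task_id)
        # Returning the right-hand side is what lets `a >> b >> c` chain.
        return other


extract = Task("extract")
transform = Task("transform")
load = Task("load")
notify = Task("notify")

extract >> transform >> load   # linear chain
transform >> [notify]          # fan-out written as a list

print(extract.downstream, transform.downstream)
```

In a real DAG file the same chain would be written between operator instances inside a DAG definition, for example `extract_task >> transform_task >> load_task`.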

Slide 48

Slide 48 text

DATA & AI BOOTCAMP 2024 Operator in Apache Airflow

Slide 49

Slide 49 text

DATA & AI BOOTCAMP 2024 Operator in Apache Airflow

Slide 50

Slide 50 text

DATA & AI BOOTCAMP 2024 Operator in Apache Airflow

Slide 51

Slide 51 text

DATA & AI BOOTCAMP 2024 Airflow Scheduling

Slide 52

Slide 52 text

DATA & AI BOOTCAMP 2024 Airflow Task State

Slide 53

Slide 53 text

DATA & AI BOOTCAMP 2024 Airflow XCom

Slide 54

Slide 54 text

DATA & AI BOOTCAMP 2024 Study More
● Apache Airflow https://airflow.apache.org/
● Apache Airflow Best Practices https://airflow.readthedocs.io/en/stable/best-practices.html
● Apache Airflow Guides https://www.astronomer.io/guides/
● Apache Airflow (YouTube Channel) https://www.youtube.com/channel/UCSXwxpWZQ7XZ1WL3wqevChA
● Data Council (YouTube Channel) https://www.youtube.com/c/DataCouncil/
● Awesome Apache Airflow https://github.com/jghoman/awesome-apache-airflow

Slide 55

Slide 55 text

DATA & AI BOOTCAMP 2024 Google Cloud Platform

Slide 56

Slide 56 text

DATA & AI BOOTCAMP 2024 Google Cloud Platform https://console.cloud.google.com/home/dashboard?project=dataaibootcamp

Slide 57

Slide 57 text

DATA & AI BOOTCAMP 2024 ขอตกลงรวมกันในการใช GCP Project ● สามารถใช Project สวนตัวได หรือ ใชรวมกันก็ได ● Project นี้จะถูกลบ หลัง Class จบ ● Project นี้ถูกสรางมาเพื่อใหชวยอํานวยความสะดวกใหนักเรียนไมตองเสียเวลาสรางProject เอง ● ขอความกรุณา ไมใช Project นี้สําหรับงานอื่นนอก เนื้อหาการสอนดวยคะ ● ขอความกรุณา ไมสงตอ Service Account ไฟลใหทานอื่นที่ไมใชนักเรียนนะคะ 󰢚 ● Naming for BigQuery DataSet: dataai_NAME_YYYY ○ เชน dataai_beat_1991 ● Naming for GCS Bucket: data-ai-NAME-YYYY ● และหากตองสราง service ใดเพิ่มเติม ○ ให ใช data-ai-NAME-YYYY เปน Prefix

Slide 58

Slide 58 text

DATA & AI BOOTCAMP 2024 Google Cloud Storage

Slide 59

Slide 59 text

DATA & AI BOOTCAMP 2024 Data Lake A Data Lake is a centralized repository that allows organizations to store structured, semi-structured, and unstructured data at any scale. Unlike traditional databases or data warehouses, a data lake can store raw data in its native format until it's needed. This makes it highly flexible for diverse use cases, including data analytics, machine learning, and big data processing.

Slide 60

Slide 60 text

DATA & AI BOOTCAMP 2024 Google Cloud Storage is a solution for Data Lake

Slide 61

Slide 61 text

DATA & AI BOOTCAMP 2024 Storage Classes

Slide 62

Slide 62 text

DATA & AI BOOTCAMP 2024 Use cases for Google Cloud Storage (GCS)
● Big Data Analytics: Process and analyze large datasets.
● Machine Learning: Provide raw data for training algorithms.
● Data Archiving: Retain historical data for regulatory compliance or future analysis.
● Data Integration: Serve as a single source of truth for disparate data sources.

Slide 63

Slide 63 text

DATA & AI BOOTCAMP 2024 Google BigQuery

Slide 64

Slide 64 text

DATA & AI BOOTCAMP 2024 Warehouse A large building where raw materials or manufactured goods may be stored before their export or distribution for sale.

Slide 65

Slide 65 text

DATA & AI BOOTCAMP 2024 Data Warehouse In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis, and is considered a core component of business intelligence throughout the enterprise. https://en.wikipedia.org/wiki/Data_warehouse

Slide 66

Slide 66 text

DATA & AI BOOTCAMP 2024 Data Warehouse Vs. Data Lake

Slide 67

Slide 67 text

DATA & AI BOOTCAMP 2024 Data Warehouse Vs. Operational Database

Slide 68

Slide 68 text

DATA & AI BOOTCAMP 2024 BigQuery is Google’s Data Warehouse Solution

Slide 69

Slide 69 text

DATA & AI BOOTCAMP 2024 BigQuery
BigQuery is a serverless, highly scalable, and cost-effective cloud data warehouse with an in-memory BI Engine and machine learning built in.
☑ Real-time analytics
☑ Standard SQL
☑ Big data ecosystem integration
☑ Federated query and logical data warehousing
☑ Storage (Colossus) and compute (Dremel) separation
☑ Geospatial data types and functions

Slide 70

Slide 70 text

DATA & AI BOOTCAMP 2024 BigQuery

Slide 71

Slide 71 text

DATA & AI BOOTCAMP 2024 Querying in BigQuery

Slide 72

Slide 72 text

DATA & AI BOOTCAMP 2024 Quiz Time!!

Slide 73

Slide 73 text

DATA & AI BOOTCAMP 2024

Slide 74

Slide 74 text

DATA & AI BOOTCAMP 2024 Course Outline Day 2: AI-Enhanced Data Journey
● Integrate with Data User
○ LINE Integration for notifications and status tracking
○ Chat with Data via Vertex AI Agent
Data journey stages: Generate Data, Transform Data, Analyse Data, Utilize Data
Roles: Software Engineer, Data Engineer, Data Analyst / Analytics Engineer, AI Engineer

Slide 75

Slide 75 text

DATA & AI BOOTCAMP 2024 Agenda
1. LINE Official Account
2. Create LINE Official Account
3. Workshop: LINE Notify Message
4. Workshop: Create LINE Webhook with Python SDK + Cloud Function
5. Vertex AI Agent Builder - RAG
6. Workshop: Create Chat with Data Agent
7. Vertex AI Search + Gemini Image Understanding
8. Workshop: Create Search Agent

Slide 76

Slide 76 text

DATA & AI BOOTCAMP 2024 LINE Chatbot

Slide 77

Slide 77 text

DATA & AI BOOTCAMP 2024 Why LINE Messaging API

Slide 78

Slide 78 text

DATA & AI BOOTCAMP 2024 LINE Official Account

Slide 79

Slide 79 text

DATA & AI BOOTCAMP 2024 LINE Message Events

Slide 80

Slide 80 text

DATA & AI BOOTCAMP 2024 LINE Message types
● Text message
● Sticker message
● Image message
● Video message
● Audio message
● Location message
● Imagemap message
● Template message
● Flex Message
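
As a rough illustration of what two of these message types look like on the wire, the sketch below builds a text message and a location message as plain dictionaries and wraps them in a reply request body. The field names follow the LINE Messaging API message object format; the helper names, the token, and the place details are hypothetical, and nothing is actually sent.

```python
import json


def text_message(text: str) -> dict:
    """Message object for a plain text message."""
    return {"type": "text", "text": text}


def location_message(title: str, address: str,
                     latitude: float, longitude: float) -> dict:
    """Message object for a location message."""
    return {
        "type": "location",
        "title": title,
        "address": address,
        "latitude": latitude,
        "longitude": longitude,
    }


def reply_body(reply_token: str, messages: list[dict]) -> dict:
    """Request body for a reply; a single reply carries one to five messages."""
    if not 1 <= len(messages) <= 5:
        raise ValueError("a reply must contain between 1 and 5 messages")
    return {"replyToken": reply_token, "messages": messages}


# Hypothetical values for illustration only.
body = reply_body(
    "dummy-reply-token",
    [
        text_message("Pipeline finished successfully"),
        location_message("Bootcamp venue", "Bangkok", 13.7563, 100.5018),
    ],
)
print(json.dumps(body, indent=2))
```

The richer types in the list (template, imagemap, Flex) use the same envelope with a more deeply nested message object.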

Slide 81

Slide 81 text

DATA & AI BOOTCAMP 2024 LINE Webhook

Slide 82

Slide 82 text

DATA & AI BOOTCAMP 2024 LINE Webhook

Slide 83

Slide 83 text

DATA & AI BOOTCAMP 2024 Validate LINE Signature
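
A minimal sketch of signature validation using only the standard library. LINE signs each webhook request by computing an HMAC-SHA256 digest of the raw request body with the channel secret and sending the Base64-encoded result in the `x-line-signature` header. The function name, secret, and body below are hypothetical; the official Python SDK provides its own validation, which the workshop may use instead.

```python
import base64
import hashlib
import hmac


def is_valid_signature(channel_secret: str, body: bytes, signature: str) -> bool:
    """Return True when `signature` matches the HMAC-SHA256 of the raw body."""
    digest = hmac.new(channel_secret.encode("utf-8"), body, hashlib.sha256).digest()
    expected = base64.b64encode(digest)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature.encode("utf-8"))


# Hypothetical secret and body; in a webhook these come from the channel
# settings and from the raw (unparsed) HTTP request body.
secret = "dummy-channel-secret"
raw_body = b'{"events": []}'
header_value = base64.b64encode(
    hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).digest()
).decode("utf-8")

print(is_valid_signature(secret, raw_body, header_value))
print(is_valid_signature(secret, raw_body + b" ", header_value))
```

The check must run on the exact bytes received, before any JSON parsing, because re-serializing the body can change its bytes and invalidate the signature.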

Slide 84

Slide 84 text

DATA & AI BOOTCAMP 2024 LINE Webhook

Slide 85

Slide 85 text

DATA & AI BOOTCAMP 2024 Flex Message https://developers.line.biz/flex-simulator/

Slide 86

Slide 86 text

DATA & AI BOOTCAMP 2024 Flex Message

Slide 87

Slide 87 text

DATA & AI BOOTCAMP 2024 Go to Workshop

Slide 88

Slide 88 text

DATA & AI BOOTCAMP 2024 Vertex AI Agent Builder

Slide 89

Slide 89 text

DATA & AI BOOTCAMP 2024 Deterministic vs. Generative

Slide 90

Slide 90 text

DATA & AI BOOTCAMP 2024 Flow-based Agent

Slide 91

Slide 91 text

DATA & AI BOOTCAMP 2024 Flow-based Agent

Slide 92

Slide 92 text

DATA & AI BOOTCAMP 2024 Generative AI

Slide 93

Slide 93 text

DATA & AI BOOTCAMP 2024 Flow-based (Deterministic) vs. Generative-based Agent

Slide 94

Slide 94 text

DATA & AI BOOTCAMP 2024 Retrieval Augmented Generation

Slide 95

Slide 95 text

DATA & AI BOOTCAMP 2024 Vertex AI Agent Builder

Slide 96

Slide 96 text

DATA & AI BOOTCAMP 2024 Conversation Agent Use-case

Slide 97

Slide 97 text

DATA & AI BOOTCAMP 2024 Vertex AI Agent Builder Step

Slide 98

Slide 98 text

DATA & AI BOOTCAMP 2024 Create Conversation Agent

Slide 99

Slide 99 text

DATA & AI BOOTCAMP 2024 Create Data Store

Slide 100

Slide 100 text

DATA & AI BOOTCAMP 2024 Create Data Store

Slide 101

Slide 101 text

DATA & AI BOOTCAMP 2024 Tools

Slide 102

Slide 102 text

DATA & AI BOOTCAMP 2024 Define Agent Goal

Slide 103

Slide 103 text

DATA & AI BOOTCAMP 2024 Integrate with LINE

Slide 104

Slide 104 text

DATA & AI BOOTCAMP 2024 Integrate with LINE

Slide 105

Slide 105 text

DATA & AI BOOTCAMP 2024 Vertex AI Search

Slide 106

Slide 106 text

DATA & AI BOOTCAMP 2024 Vertex AI Search

Slide 107

Slide 107 text

DATA & AI BOOTCAMP 2024 Vertex AI Search
1. Create Search Agent
2. Create Data Store
3. Import Documents
4. Select Large Language Model (LLM) for Search Result Summarization
5. Search Result in both Widget and API

Slide 108

Slide 108 text

DATA & AI BOOTCAMP 2024 Create Vertex AI Search

Slide 109

Slide 109 text

DATA & AI BOOTCAMP 2024 Vertex AI Search

Slide 110

Slide 110 text

DATA & AI BOOTCAMP 2024 Handle Text Message

Slide 111

Slide 111 text

DATA & AI BOOTCAMP 2024 Vertex AI Search - Integration

Slide 112

Slide 112 text

DATA & AI BOOTCAMP 2024 Vertex AI Search

Slide 113

Slide 113 text

DATA & AI BOOTCAMP 2024 Flex Message for Search Results

Slide 114

Slide 114 text

DATA & AI BOOTCAMP 2024 Gemini Image Understanding

Slide 115

Slide 115 text

DATA & AI BOOTCAMP 2024 Image Search Solution

Slide 116

Slide 116 text

DATA & AI BOOTCAMP 2024 Image Search Result

Slide 117

Slide 117 text

DATA & AI BOOTCAMP 2024 Summary

Slide 118

Slide 118 text

DATA & AI BOOTCAMP 2024 Big Data & AI Integration https://binariks.com/blog/how-big-data-and-ai-work-together/

Slide 119

Slide 119 text

DATA & AI BOOTCAMP 2024 Quiz Time

Slide 120

Slide 120 text

DATA & AI BOOTCAMP 2024

Slide 121

Slide 121 text

DATA & AI BOOTCAMP 2024 Course Outline Day 2: AI-Enhanced Data Journey
● AI-Powered Development
○ Use Gemini Code Assist to speed up pipeline development
○ BigQuery Data Canvas to analyze data with AI
● Integrate with Data User
○ LINE Integration for notifications and status tracking
○ Chat with Data via Vertex AI Agent

Slide 122

Slide 122 text

DATA & AI BOOTCAMP 2024 Agenda - GitHub Folders 11-12
1. What is Generative AI
2. How AI code assistance boosts productivity
3. Demo: AI-assisted Dev workflow
4. How Gemini in BigQuery helps data analysis
5. Demo: Gemini in BigQuery
6. Summary

Slide 123

Slide 123 text

DATA & AI BOOTCAMP 2024 Gemini Code Assist for Data Engineering Task

Slide 124

Slide 124 text

DATA & AI BOOTCAMP 2024 Discriminative vs. Generative AI

Slide 125

Slide 125 text

DATA & AI BOOTCAMP 2024 Generative AI based on Text Input Data

Slide 126

Slide 126 text

DATA & AI BOOTCAMP 2024 Generative AI & Use-Case Overview

Slide 127

Slide 127 text

DATA & AI BOOTCAMP 2024 AI across the development workflow

Slide 128

Slide 128 text

DATA & AI BOOTCAMP 2024 Example of Today’s Tech Landscape

Slide 129

Slide 129 text

DATA & AI BOOTCAMP 2024 Traditional Debugging Tactics

Slide 130

Slide 130 text

DATA & AI BOOTCAMP 2024 Future Debugging Tactics

Slide 131

Slide 131 text

DATA & AI BOOTCAMP 2024 Hallucination - Challenges in Large Language Models

Slide 132

Slide 132 text

DATA & AI BOOTCAMP 2024 Code Assistants Landscape

Slide 133

Slide 133 text

DATA & AI BOOTCAMP 2024 Why should you consider using an AI coding assistant?

Slide 134

Slide 134 text

DATA & AI BOOTCAMP 2024 Available for multiple IDEs and developer surfaces

Slide 135

Slide 135 text

DATA & AI BOOTCAMP 2024 Available for multiple IDEs and developer surfaces

Slide 136

Slide 136 text

DATA & AI BOOTCAMP 2024 Gemini Code Assist

Slide 137

Slide 137 text

DATA & AI BOOTCAMP 2024 Gemini Code Assist in Cloud Shell and VS Code (screenshots: Cloud Shell Editor, VS Code)

Slide 138

Slide 138 text

DATA & AI BOOTCAMP 2024 Enable Gemini for Google Cloud

Slide 139

Slide 139 text

DATA & AI BOOTCAMP 2024 To Start: Add Gemini Code Assist Extension in VSCode

Slide 140

Slide 140 text

DATA & AI BOOTCAMP 2024 Interact with LLM - Prompting Technique

Slide 141

Slide 141 text

DATA & AI BOOTCAMP 2024 Interact with LLM - Prompting Technique

Slide 142

Slide 142 text

DATA & AI BOOTCAMP 2024 Let’s build data pipeline with GenAI https://www.coingecko.com/

Slide 143

Slide 143 text

DATA & AI BOOTCAMP 2024 Gemini in BigQuery

Slide 144

Slide 144 text

DATA & AI BOOTCAMP 2024 Gemini in BigQuery

Slide 145

Slide 145 text

DATA & AI BOOTCAMP 2024 BigQuery Data Canvas

Slide 146

Slide 146 text

DATA & AI BOOTCAMP 2024 Understand data with Gemini in BigQuery
BigQuery Public Dataset: https://console.cloud.google.com/marketplace/product/bigquery-public-data/thelook-ecommerce

Slide 147

Slide 147 text

DATA & AI BOOTCAMP 2024 Go to Workshop

Slide 148

Slide 148 text

DATA & AI BOOTCAMP 2024 BigQuery ML

Slide 149

Slide 149 text

DATA & AI BOOTCAMP 2024 BigQuery ML

Slide 150

Slide 150 text

DATA & AI BOOTCAMP 2024 Summary

Slide 151

Slide 151 text

DATA & AI BOOTCAMP 2024 AI-assisted: best practices

Slide 152

Slide 152 text

DATA & AI BOOTCAMP 2024 How May Coding Assistance Shape Your Dev Workflow?

Slide 153

Slide 153 text

DATA & AI BOOTCAMP 2024 Balancing Generative AI Code Assist Benefits vs. Challenges

Slide 154

Slide 154 text

DATA & AI BOOTCAMP 2024 Human + AI prompting Interaction

Slide 155

Slide 155 text

DATA & AI BOOTCAMP 2024 Gemini’s Utility

Slide 156

Slide 156 text

DATA & AI BOOTCAMP 2024 Quiz Time!!