Slide 1

Slide 1 text

Synapse 101
Daron Yöndem
http://daron.me | @daronyondem

Slide 2

Slide 2 text

[Diagram: platform layers, top to bottom]
Consumers: Artificial Intelligence / Machine Learning / Internet of Things / Intelligent Apps / Business Intelligence
Experience: Azure Synapse Studio
Languages: SQL, Python, .NET, Java, Scala, R
Analytics Runtimes (Form Factors): Provisioned, Serverless
Shared services: Data Integration, Management, Monitoring, Security, Metastore
Platform: Azure Data Lake Storage, Common Data Model, Enterprise Security, Optimized for Analytics

Slide 3

Slide 3 text

Linked Services
• Linked services define the connection information needed to connect to external resources.
• Easy cross-platform data migration
• Represent data stores or compute resources

Slide 4

Slide 4 text

Datasets
Once a dataset is defined, it can be used in pipelines as a source of data or as a sink of data.
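A dataset reaches its data through a linked service. As a rough sketch, the definitions behind the two might look like the following, written here as Python dicts and serialized to JSON. The names `DataLakeLS` and `SalesParquet`, the storage URL placeholder, and the folder path are hypothetical, and the property names are an assumption based on the Data Factory-style JSON format, so check them against the documentation before relying on them.

```python
import json

# Hypothetical linked service: holds only connection information.
linked_service = {
    "name": "DataLakeLS",
    "properties": {
        "type": "AzureBlobFS",
        "typeProperties": {"url": "https://<account>.dfs.core.windows.net"},
    },
}

# Hypothetical dataset: names the data and points at the linked service.
dataset = {
    "name": "SalesParquet",
    "properties": {
        "type": "Parquet",
        "linkedServiceName": {
            "referenceName": linked_service["name"],
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobFSLocation",
                "fileSystem": "raw",
                "folderPath": "sales/2020",
            }
        },
    },
}


def dataset_json():
    """Serialize the dataset definition for inspection."""
    return json.dumps(dataset, indent=2)
```

The same dataset can then be referenced by name from a pipeline activity, either as its source or as its sink.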

Slide 5

Slide 5 text

Pipelines
• 90+ connectors
• Various activities
• Supports common loading patterns
• Fully parallel loading into data lake or SQL tables
• Graphical development experience

Slide 6

Slide 6 text

Data Flows
• Handle upserts, updates, and deletes on SQL sinks
• Commonly used ETL patterns (sequence generator, lookup transformation, SCD…)
• Additional file handling (move files after read, write files to file names described in rows, etc.)

Slide 7

Slide 7 text

Triggers
Triggers represent a unit of processing that determines when a pipeline execution needs to be kicked off.
• Scheduled
• Event-based
• Tumbling window
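A tumbling window trigger fires once per fixed-size, non-overlapping, contiguous time interval. The sketch below only illustrates how such windows partition a time range. It is not the service's implementation, and the function name `tumbling_windows` is made up for this example.

```python
from datetime import datetime, timedelta


def tumbling_windows(start, end, interval):
    """Yield (window_start, window_end) pairs that tile [start, end)."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    window_start = start
    # Only complete windows are emitted; a trailing partial window is skipped.
    while window_start + interval <= end:
        yield window_start, window_start + interval
        window_start += interval


# Three hourly windows between midnight and 03:00.
windows = list(
    tumbling_windows(
        datetime(2020, 1, 1, 0),
        datetime(2020, 1, 1, 3),
        timedelta(hours=1),
    )
)
```

Each pipeline run would receive one window's start and end, which makes backfills and reruns of a single interval straightforward.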

Slide 8

Slide 8 text

Modern Data Warehouse
[Diagram labels: Data Sources, Ingest, Store, Prepare, Transform & Enrich, Serve, Visualize (Azure Synapse Analytics)]

Slide 9

Slide 9 text

Exploratory Data Analysis Preparing To Transform

Slide 10

Slide 10 text

CSV Files

Slide 11

Slide 11 text

Parquet Files

Slide 12

Slide 12 text

Go batch

Slide 13

Slide 13 text

Applying Transformations

Slide 14

Slide 14 text

Code based transformations - Spark
Starting from a table, auto-generate a single line of PySpark code that makes it easy to load a SQL table into a Spark dataframe.

Slide 15

Slide 15 text

Code based transformations - SQL

Slide 16

Slide 16 text

Transform with Notebooks
• Lets you write multiple languages in one notebook
• Lets you share temporary tables across languages
• Language support: syntax highlighting, syntax error detection, code completion, smart indent, code folding
• Export results

Slide 17

Slide 17 text

Transform with Pipelines and Data Flows

Slide 18

Slide 18 text

Transform with Serverless
• An interactive query service that runs T-SQL queries over large-scale data in Azure Storage
• No infrastructure to manage
• Pay only for query execution
• T-SQL syntax to query data
• Supports data in various formats (Parquet, CSV, JSON)
• Support for the BI ecosystem
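To make the "T-SQL over files" idea concrete, here is a small sketch that assembles the kind of `OPENROWSET` query a serverless SQL endpoint accepts. The helper `openrowset_query` and the storage URL are hypothetical and exist only for illustration. The CSV options shown (`PARSER_VERSION`, `HEADER_ROW`) are the ones I would expect to need, but verify them against the current documentation.

```python
def openrowset_query(url, fmt="PARQUET", top=10):
    """Build a T-SQL query that reads files in place with OPENROWSET."""
    fmt = fmt.upper()
    if fmt not in ("PARQUET", "CSV"):
        raise ValueError("this sketch only covers PARQUET and CSV")
    safe_url = url.replace("'", "''")  # escape single quotes for T-SQL
    options = [f"BULK '{safe_url}'", f"FORMAT = '{fmt}'"]
    if fmt == "CSV":
        # Assumed CSV settings: newer parser, first row holds column names.
        options += ["PARSER_VERSION = '2.0'", "HEADER_ROW = TRUE"]
    return (
        f"SELECT TOP {int(top)} *\n"
        "FROM OPENROWSET(\n    "
        + ",\n    ".join(options)
        + "\n) AS [result];"
    )


# Hypothetical storage path, used only to show the shape of the query.
parquet_sql = openrowset_query(
    "https://<account>.dfs.core.windows.net/raw/sales/*.parquet"
)
csv_sql = openrowset_query(
    "https://<account>.dfs.core.windows.net/raw/sales/*.csv", fmt="csv", top=5
)
```

The resulting text can be pasted into a SQL script in Synapse Studio; nothing has to be loaded or provisioned before the files can be queried.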

Slide 19

Slide 19 text

Machine Learning in Azure Synapse Analytics

Slide 20

Slide 20 text

Making Predictions with T-SQL
• Create the model (Azure Machine Learning or Azure Synapse Spark Notebook): Train Model, Convert to ONNX, Export to Storage
• Register the model (Azure Synapse SQL, SQL Script): Read model from Storage, Load into Table
• Use the model (Azure Synapse SQL, SQL Script): Load from Table, Use Predict, Insert Predictions
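The "use the model" step comes down to a T-SQL `PREDICT` call that reads the ONNX model from a table and scores rows from another table. The sketch below assembles such a statement as text. All table and column names (`dbo.Models`, `dbo.Trips`, `Score`) and the helper `predict_query` are hypothetical, and the exact clause layout should be checked against the T-SQL reference.

```python
def predict_query(model_table, model_id, data_table, score_column="Score"):
    """Build a T-SQL statement that scores rows with a stored ONNX model."""
    return (
        f"SELECT d.*, p.{score_column}\n"
        "FROM PREDICT(\n"
        f"    MODEL = (SELECT Model FROM {model_table} WHERE Id = {int(model_id)}),\n"
        f"    DATA = {data_table} AS d,\n"
        "    RUNTIME = ONNX\n"
        f") WITH ({score_column} FLOAT) AS p;"
    )


# Hypothetical names, for illustration only.
sql = predict_query("dbo.Models", 1, "dbo.Trips")
```

Wrapping this `SELECT` in an `INSERT INTO ... SELECT` would cover the final "insert predictions" step.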

Slide 21

Slide 21 text

From ingestion to visualization. Demo

Slide 22

Slide 22 text

Databricks vs Synapse
• If you are primarily looking for a data warehousing solution, go with Azure Synapse Analytics.
• If you are looking for a Spark solution and don't have data warehousing needs, go with Azure Databricks. For Spark-based ML scenarios, use Azure Machine Learning from within Azure Databricks for experiment tracking, automated machine learning, and MLOps.
• If you are heavily invested in Spark and have data warehousing needs, go with both Azure Databricks and Azure Synapse.

Slide 23

Slide 23 text

Links worth sharing
• Microsoft Cloud Workshop for Azure Synapse Analytics and AI: https://drn.fyi/2FkuMCw
• Data and AI Engagement Accelerators for PoC: https://drn.fyi/3dgf87C

Slide 24

Slide 24 text

Thanks
http://daron.me | @daronyondem
Download slides here: http://decks.daron.me