
Simplifying Data Analysis & Visualization with Developer Tools & AI

Having data analysis and visualization skills is increasingly important in the new age of Large Language Models and generative AI. But how does a non-Python developer skill up rapidly with the tools & best practices required to achieve project goals, without having the benefit of years of Python or data science experience? This is where the right developer tooling, with a little bit of AI assistance, can help.

In this talk, we'll go from identifying an open-source data set, to analyzing it for insights and visualizing relevant outcomes, in 25 minutes - with just a GitHub account and an OpenAI endpoint.

Along the way, we'll introduce you to a series of developer tools that make your journey easier:
- Open Dataset: to "analyze" - from Kaggle, Hugging Face, or Azure
- Data Wrangler: to "sanitize" data - extension from Visual Studio Code
- Jupyter Notebook: to "record" process - for transferable learning
- GitHub Codespaces: to "pre-build" environment - for consistent reuse
- GitHub Copilot: to "explain/fix" code - for focused learning with AI help
- Microsoft LIDA: to "suggest/build" visualization goals - for building your intuition with AI help

The talk comes with an associated repo that you can fork - then replace with your own dataset to extend or experiment on your own later. By the end of the talk you should have a sense of how you can go from discovering a data set to getting some visual insights about it, using existing tools with a little AI assistance.

Nitya Narasimhan, PhD

March 14, 2024

Transcript

  1. Simplifying Data Analysis & Visualization with Developer Tools & AI

    Nitya Narasimhan, PhD Senior AI Advocate, Microsoft @nitya | #in/nityan
  2. Data Science Day 2024 | Nitya Narasimhan, PhD What We’ll Cover Today – Fork The Repo To Follow Along!

    https://aka.ms/workshops/python-data-analysis
    01 | Set up a consistent and reusable dev environment using GitHub Codespaces | Exercise: Instantiate the Codespaces-Jupyter template & launch it
    02 | Explore Jupyter notebooks for data science & machine learning examples | Exercise: Validate ability to run Jupyter notebooks without added effort
    03 | Add GitHub Copilot extension. Explore use to create notebooks and explain examples | Exercise: Create notebooks, learn Python data structures & visualization
    04 | Use a Visual Studio Code Data Science profile & extensions in your devcontainer | Exercise: Complete the VS Code datasci tutorial, explore Data Wrangler
    05 | Explore open datasets (curated & shared by the ML community) to start exploration | Exercise: Load & explore dataset from Hugging Face, Kaggle, Azure
    06 | Understand principles of responsible AI and use toolbox to train & debug your model | Exercise: Explore text or tabular data & model from Hugging Face
    07 | Explore LLM-based data visualization with Microsoft LIDA for intuition, suggestions | Exercise: Use natural language to get goals, visualizations & refine
    08 | Make the paradigm shift from ML Ops to LLM Ops (predictive to generative AI apps) | Exercise: Explore the Azure AI Studio (UI & SDK) capabilities
    09 | Customize & extend the template to suit your learning needs and share feedback! | Exercise: Pick a different open dataset and try these steps yourself
    10 | Related resources for self-guided learners to continue their journey. Thank you! Q&A | Exercise: See #14DaysOfDataScience posts on Developer Tools
  3. Data Science Day 2024 | Nitya Narasimhan, PhD Data analysis

    – drives the ML models – that power AI algorithms. Image Credit | Microsoft Learn. Must-have skill for a data scientist. Good-to-have skill for an AI developer. Trends show a shift left in the application lifecycle, giving developers more responsibility in earlier stages of the workflow. It’s just fun to explore data and gain insights. The Motivation – Why should I learn Data Analysis?
  4. Data Science Day 2024 | Nitya Narasimhan, PhD KNOWLEDGE GAP

    I know what I don’t know but I can plan my journey I don’t know what I don’t know so how do I even start? Use Developer Tools Use AI Assistance INTUITION GAP The Challenge – What stops me from learning it?
  5. Data Science Day 2024 | Nitya Narasimhan, PhD I lack

    data science & Python expertise I can’t do this! I want to learn how to do <data analysis & visualization> I have dev tools & AI assistance Where do I start? The Mindset – How can I cultivate my curiosity to learn? Tired Wired
  6. Data Science Day 2024 | Nitya Narasimhan, PhD Help me

    get set up and productive quickly .. The Approach – how can I practice goal-oriented learning? Make it FRICTIONLESS. Make progress towards goal without distractions: Keep it FOCUSED. Make it reproducible by others for collaboration: Make it FRIENDLY. “hey - it works on my machine..” “let me google this – should be quick” “I don’t remember – let me explain it”
  7. Data Science Day 2024 | Nitya Narasimhan, PhD Challenge 01

    Make It Frictionless – I need a consistent dev environment with easy setup https://aka.ms/workshops/python-data-analysis
  8. Data Science Day 2024 | Nitya Narasimhan, PhD A |

    Start Development with a suitable “Codespaces” Template. This repo extends the codespaces-jupyter template from GitHub. Exercise 01 See: https://aka.ms/workshops/python-data-analysis You’ll find updated exercises in the `data-science-day-2024` branch. Uncheck this before you fork the repo, to get all branches copied.
  9. Data Science Day 2024 | Nitya Narasimhan, PhD B |

    Fork the template & launch it with GitHub Codespaces (cloud) Exercise 01 See: https://aka.ms/workshops/python-data-analysis You can now launch a GitHub Codespaces instance directly from branch in the browser
  10. Data Science Day 2024 | Nitya Narasimhan, PhD B |

    Clone the template, and launch it with Docker Desktop (local) Exercise 01 See: https://aka.ms/workshops/python-data-analysis Or you can clone it to your local device and open it in Visual Studio Code .. If you have the Dev Containers extension installed and Docker Desktop running, you should see this ..
  11. Data Science Day 2024 | Nitya Narasimhan, PhD C |

    Get a pre-built dev environment that works the same for everyone Exercise 01 See: https://aka.ms/workshops/python-data-analysis Either way, you have a Visual Studio Code environment that is set up with a pre-built dev container with all dependencies installed for you – with no added effort.
  12. Data Science Day 2024 | Nitya Narasimhan, PhD D |

    Learn How Dev Containers Work (Cloud vs. Local) Exercise 01 See: https://aka.ms/workshops/python-data-analysis The container runs a Visual Studio Code server that you can connect to from a Visual Studio Code client (browser or local) – with the repository being visible to both. Configuration as code: a devcontainer.json file defines the environment and is version controlled like any other file.
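To make the "configuration as code" idea concrete, here is a minimal sketch of what a devcontainer.json might contain, built as a Python dictionary and serialized to JSON. The container name, the image tag, the extension list, and the post-create command are illustrative assumptions, not the contents of the workshop repo.

```python
import json

# Hypothetical dev container definition; every value below is an
# illustrative assumption rather than the workshop repo's actual file.
devcontainer = {
    "name": "python-data-analysis",
    # A general-purpose image published for dev containers.
    "image": "mcr.microsoft.com/devcontainers/universal:2",
    "customizations": {
        "vscode": {
            # Extensions installed automatically when the container starts.
            "extensions": [
                "ms-python.python",
                "ms-toolsai.jupyter",
            ]
        }
    },
    # Command run once after the container is created (assumed file name).
    "postCreateCommand": "pip install -r requirements.txt",
}

# Serialize to the JSON text that would live in .devcontainer/devcontainer.json.
devcontainer_json = json.dumps(devcontainer, indent=2)
print(devcontainer_json)
```

Because the file is plain JSON in the repository, a change to the environment shows up in a pull request like any other code change.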
  13. Data Science Day 2024 | Nitya Narasimhan, PhD Challenge 02

    Make It Friendly – I need a reproducible environment for easy collaboration https://aka.ms/workshops/python-data-analysis
  14. Data Science Day 2024 | Nitya Narasimhan, PhD A |

    “matplotlib” is the most popular library for 2D data visualizations Exercise 02 See: https://aka.ms/workshops/python-data-analysis Open the matplotlib example notebook & select a kernel using existing Python env.
  15. Data Science Day 2024 | Nitya Narasimhan, PhD B |

    Learn to run, edit, extend – Jupyter Notebooks in this environment Exercise 02 See: https://aka.ms/workshops/python-data-analysis Run All – executes all code cells in the selected Python 3.10.13 env. No added effort in setup. Ability to add “markdown” cells to document the code. Modify code or data to explore ideas in an interactive way. Share the notebook with collaborators for contributions and debugging.
  16. Data Science Day 2024 | Nitya Narasimhan, PhD C |

    Run the “pandas+matplotlib” example and intuit how it works Exercise 02 See: https://aka.ms/workshops/python-data-analysis Open the population example notebook and run it as before Note how pandas works by creating a df (data frame) from structured data (CSV) Then uses matplotlib to create a plot using data from 2 separate columns of that table
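The pandas-then-matplotlib flow described on this slide can be sketched in a few lines. The inline CSV and its column names (`year`, `population_millions`) are made-up stand-ins for the population example notebook, which is not reproduced here.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs outside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up structured data standing in for a CSV file on disk.
csv_text = """year,population_millions
2000,10.0
2010,12.5
2020,15.0
"""

# pandas parses the CSV into a DataFrame (a table with named columns).
df = pd.read_csv(io.StringIO(csv_text))

# matplotlib plots one column against another.
fig, ax = plt.subplots()
ax.plot(df["year"], df["population_millions"], marker="o")
ax.set_xlabel("Year")
ax.set_ylabel("Population (millions)")
ax.set_title("Population over time (illustrative data)")
fig.tight_layout()
```

In a notebook the figure renders inline after the cell runs; swapping the CSV text for `pd.read_csv("your_file.csv")` is the only change needed to try a different dataset.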
  17. Data Science Day 2024 | Nitya Narasimhan, PhD D |

    Now copy the code cell, change data – and see if intuition holds up Exercise 02 See: https://aka.ms/workshops/python-data-analysis The good news is that I can modify code and experiment inline to learn by doing. The bad news is that I don’t understand what the code does – or if there are better options I could use. Let’s try a cricket data set (IPL 2022) shared on Kaggle by a fan
  18. Data Science Day 2024 | Nitya Narasimhan, PhD Challenge 03

    Keep it Focused – I want to progress in my goal without getting distracted https://aka.ms/workshops/python-data-analysis
  19. Data Science Day 2024 | Nitya Narasimhan, PhD A |

    Install the GitHub Copilot Extension – activate inline AI assistance Exercise 03 See: https://aka.ms/workshops/python-data-analysis Add the extension to devcontainer.json if you want it installed by default in that env. GitHub Copilot is a paid offering with a free trial to explore it.
  20. Data Science Day 2024 | Nitya Narasimhan, PhD B |

    Use GitHub Copilot in Chat mode – create notebooks to learn Exercise 03 See: https://aka.ms/workshops/python-data-analysis Using inline mode sets context to that specific file. Using chat mode sets context to the workspace, with richer options. Use chat mode to create a notebook to learn pandas usage.
  21. Data Science Day 2024 | Nitya Narasimhan, PhD C |

    Use GitHub Copilot inline – ask for explainers or fix errors in context Exercise 03 See: https://aka.ms/workshops/python-data-analysis The copilot-created notebook has errors but is a good starter for learning phases. Ask copilot chat to explain bug. See how it references the file. Ask copilot inline to fix the bug. See how it gives you a choice.
  22. Data Science Day 2024 | Nitya Narasimhan, PhD E |

    Explore GitHub Copilot suggestions – fill knowledge & intuition gaps Exercise 03 See: https://aka.ms/workshops/python-data-analysis Use suggested next prompts to fill in knowledge gaps – without losing focus Stop if knowledge gap is filled. Pivot from asking to doing Instead of “googling” and falling into a rabbit hole of search results, use Copilot as a contextual question-answer system that keeps you inside the development environment – and ties responses to code references Build intuition by trying suggestions and learning to make connections between code and outcomes (success or failure)
  23. Data Science Day 2024 | Nitya Narasimhan, PhD F |

    Explore GitHub Copilot for goal-oriented task – prompt engineering Exercise 03 See: https://aka.ms/workshops/python-data-analysis Open a new code cell and write a prompt to get your task done Refine the prompt interactively to move closer to desired goal Use suggestions to refactor code, goals. Add markdown to recall insights later
  24. Data Science Day 2024 | Nitya Narasimhan, PhD Challenge 04

    Make It Friendly – I need a reproducible environment for easy collaboration https://aka.ms/workshops/python-data-analysis
  25. Data Science Day 2024 | Nitya Narasimhan, PhD A |

    Use Visual Studio Code Profiles – Customize editor for productivity Exercise 04 See: https://aka.ms/workshops/python-data-analysis Start with the Data Science Profile to get popular extensions.
  26. Data Science Day 2024 | Nitya Narasimhan, PhD B |

    Activate Data Wrangler Extension – view & edit for data cleaning Exercise 04 See: https://aka.ms/workshops/python-data-analysis
  27. Data Science Day 2024 | Nitya Narasimhan, PhD C |

    Editing in Data Wrangler Extension – operations auto-generate code Exercise 04 See: https://aka.ms/workshops/python-data-analysis
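As a rough illustration of what "operations auto-generate code" means, the sketch below shows the kind of pandas function a point-and-click cleaning session could produce. It is written by hand for this example, not copied from Data Wrangler output, and the column names (`team`, `runs`) are hypothetical.

```python
import pandas as pd


def clean_data(df: pd.DataFrame) -> pd.DataFrame:
    """Apply three cleaning steps, one per UI operation (hypothetical columns)."""
    # Drop rows with missing values in column 'runs'.
    df = df.dropna(subset=["runs"])
    # Rename column 'team' to 'team_name'.
    df = df.rename(columns={"team": "team_name"})
    # Change column type to int64 for column 'runs'.
    df = df.astype({"runs": "int64"})
    return df


# Small made-up table with one missing value.
raw = pd.DataFrame({"team": ["A", "B", "C"], "runs": [150.0, None, 172.0]})
cleaned = clean_data(raw.copy())
print(cleaned)
```

Keeping each step as a commented line of code means the cleaning can be re-run on a fresh copy of the data and reviewed by collaborators.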
  28. Data Science Day 2024 | Nitya Narasimhan, PhD Challenge 05

    Make It Frictionless – I have the tools & environment. How about the data? https://aka.ms/workshops/python-data-analysis
  29. Data Science Day 2024 | Nitya Narasimhan, PhD A |

    Explore Kaggle Datasets – use community notebooks for inspiration Exercise 05 See: https://aka.ms/workshops/python-data-analysis IPL 2022 Dataset Open Data Commons License downloaded Oct 2023 Example EDA on Kaggle. Find a dataset in a domain of interest (ideas for insights) Find EDA examples from the community to get intuition on how to explore data Find ML model examples based on that dataset, to learn new libraries (sklearn) https://www.kaggle.com/code/coolboyraghu/ipl-score-prediction
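A first exploratory data analysis (EDA) pass usually looks something like the sketch below: check shape, types, and missing values, then aggregate by a category. The table is made-up data with hypothetical column names (`team`, `runs`, `won`); it is not the IPL 2022 dataset from the slide.

```python
import pandas as pd

# Made-up match records; column names are assumptions for illustration.
matches = pd.DataFrame(
    {
        "team": ["A", "B", "A", "C", "B", "A"],
        "runs": [180, 165, 150, 172, 190, 160],
        "won": [True, False, False, True, True, True],
    }
)

# Step 1: get a feel for the table.
print(matches.shape)         # rows and columns
print(matches.dtypes)        # column types
print(matches.isna().sum())  # missing values per column

# Step 2: summarize a numeric column.
print(matches["runs"].describe())

# Step 3: aggregate per team and rank by average runs.
summary = (
    matches.groupby("team")
    .agg(
        matches_played=("runs", "size"),
        avg_runs=("runs", "mean"),
        wins=("won", "sum"),
    )
    .sort_values("avg_runs", ascending=False)
)
print(summary)
```

Community notebooks on Kaggle tend to follow the same three steps before moving on to plots or models, which makes them easy to read once this pattern is familiar.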
  30. Data Science Day 2024 | Nitya Narasimhan, PhD B |

    Explore Hugging Face – curated datasets for Deep Learning models Exercise 05 See: https://huggingface.co/docs/datasets/index Find datasets for new tasks and explore new libraries and tutorials
  31. Data Science Day 2024 | Nitya Narasimhan, PhD C |

    Explore Azure Open Datasets – curated datasets for many domains Exercise 05 See: https://learn.microsoft.com/azure/open-datasets/ Expand your understanding from community curated dataset to big data mindset
  32. Data Science Day 2024 | Nitya Narasimhan, PhD Challenge 06

    Make It Friendly – Debug models and decision-making for responsible AI https://aka.ms/workshops/python-data-analysis
  33. Data Science Day 2024 | Nitya Narasimhan, PhD A |

    Explore Responsible AI Toolkit – one notebook example at a time! Exercise 06 See: https://responsibleaitoolbox.ai/ Model Debugging Decision-Making Let's imagine that the diabetes progression scores predicted by the model are used to determine medical insurance rates. If the score is greater than 120, there is a higher rate. Patient 43's model score of 268.08 results in this increased rate, and they want to know how they should change their health to get a lower rate prediction from the model (leading to lower insurance price). The What-If counterfactuals component shows how slightly different feature values affect model predictions. This can be used to solve Patient 43's problem. https://github.com/microsoft/responsible-ai-toolbox/tree/main/notebooks/responsibleaidashboard/tabular
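The what-if idea can be approximated by hand to build intuition. The sketch below is not the Responsible AI dashboard's counterfactual component: it swaps in a plain linear regression on scikit-learn's bundled diabetes dataset and a manual sweep over one feature. The chosen row, the `bmi` deltas, and the reuse of the 120 threshold are assumptions for illustration, and the row is not claimed to be the slide's "Patient 43".

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression

# Bundled dataset; features are already mean-centered and scaled.
data = load_diabetes(as_frame=True)
X, y = data.data, data.target
model = LinearRegression().fit(X, y)

THRESHOLD = 120  # score above this means the higher rate (from the scenario)

# Hypothetical patient: simply the first row of the dataset.
patient = X.iloc[[0]].copy()
baseline = float(model.predict(patient)[0])


def what_if(feature, deltas):
    """Return (delta, predicted score) pairs after shifting one feature."""
    results = []
    for delta in deltas:
        candidate = patient.copy()
        candidate[feature] = candidate[feature] + delta
        results.append((delta, float(model.predict(candidate)[0])))
    return results


# Deltas are in the dataset's scaled units, not real BMI points.
bmi_results = what_if("bmi", [-0.05, -0.025, 0.0, 0.025])
for delta, score in bmi_results:
    rate = "higher rate" if score > THRESHOLD else "lower rate"
    print(f"bmi {delta:+.3f} -> score {score:.1f} ({rate})")
```

A real counterfactual tool searches across several features at once and respects constraints on which features a person can change; this single-feature sweep only shows the underlying question being asked of the model.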
  34. Data Science Day 2024 | Nitya Narasimhan, PhD Challenge 07

    Keep it Focused – I need to build my intuition but I don’t know where to start https://aka.ms/workshops/python-data-analysis
  35. Data Science Day 2024 | Nitya Narasimhan, PhD LIDA is

    a library for generating data visualizations and data-faithful infographics. LIDA is grammar agnostic (will work with any programming language and visualization libraries e.g. matplotlib, seaborn, altair, d3 etc) and works with multiple large language model providers (OpenAI, Azure OpenAI, PaLM, Cohere, Huggingface). Research Paper: https://arxiv.org/abs/2303.02927 Explore Project LIDA – Visualize Data using LLM & Natural Language Prompts Exercise 07 See: https://aka.ms/lida/org
  36. Data Science Day 2024 | Nitya Narasimhan, PhD A |

    Add Your OpenAI Key – Codespaces Secret vs. Local Env Variable Exercise 07
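Both options end up in the same place from the code's point of view: a Codespaces secret is exposed to the running codespace as an environment variable, and a local setup exports the variable in the shell. A minimal sketch of reading it, assuming the variable is named `OPENAI_API_KEY` (the helper name `get_openai_key` is made up for this example):

```python
import os


def get_openai_key(env=None):
    """Return the API key from the environment, or fail with a clear message."""
    # Allow a custom mapping to be passed in, which makes the helper testable.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Add it as a Codespaces secret "
            "or export it in your local shell before running the notebook."
        )
    return key
```

Reading the key from the environment keeps it out of the notebook and out of version control, so the same notebook runs unchanged in the cloud and locally.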
  37. Data Science Day 2024 | Nitya Narasimhan, PhD B |

    Ask LIDA to generate a summary of the dataset Exercise 07 Using older screenshots (vs. live demo) given issues in OpenAI today
  38. Data Science Day 2024 | Nitya Narasimhan, PhD C |

    Generate Goals for me from the data – build intuition on what, how Exercise 07
  39. Data Science Day 2024 | Nitya Narasimhan, PhD D |

    Generate goals for me – but make them customized to my persona Exercise 07
  40. Data Science Day 2024 | Nitya Narasimhan, PhD E |

    Show me different ways to visualize the data – for the same goal Exercise 07
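The summarize, goals, and visualize steps from the preceding slides can be strung together roughly as follows. This is a sketch based on the usage shown in the LIDA project's README as recalled here, so check the call names and parameters against the version you install. The function name `run_lida_pipeline`, the temperature value, and the example file name are assumptions.

```python
from typing import Optional


def run_lida_pipeline(csv_path: str, persona: Optional[str] = None, n_goals: int = 2):
    """Summarize a CSV, propose goals, and chart the first goal with LIDA."""
    # Imported lazily so this file loads even when lida is not installed.
    from lida import Manager, TextGenerationConfig, llm

    # Assumes the OpenAI key is available via the OPENAI_API_KEY variable.
    lida = Manager(text_gen=llm("openai"))
    config = TextGenerationConfig(n=1, temperature=0.2, use_cache=True)

    # 1. Compact natural-language summary of the dataset.
    summary = lida.summarize(csv_path, textgen_config=config)
    # 2. Candidate visualization goals, optionally tailored to a persona.
    goals = lida.goals(summary, n=n_goals, persona=persona, textgen_config=config)
    # 3. Generated chart code for the first goal, using seaborn.
    charts = lida.visualize(
        summary=summary, goal=goals[0], textgen_config=config, library="seaborn"
    )
    return summary, goals, charts


# Example call (needs network access and a valid key; file name is hypothetical):
# summary, goals, charts = run_lida_pipeline("ipl_2022.csv", persona="a cricket fan")
```

Changing `library` to another supported plotting library, or passing a different persona, is a cheap way to see several takes on the same goal.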
  41. Data Science Day 2024 | Nitya Narasimhan, PhD GitHub Copilot

    can do this too .. But you can vary the parameters and customize base prompt here programmatically It’s open source so you can do more if needed F | Ask questions of the data in natural language – and get visualizations Exercise 07
  42. Data Science Day 2024 | Nitya Narasimhan, PhD Provides flexibility

    for trial-and-error experiments to build intuition. G | Prompt Engineering works in user queries Exercise 07
  43. Data Science Day 2024 | Nitya Narasimhan, PhD lida/components/viz/vizexplainer.py system_prompt

    = """ You are a helpful assistant highly skilled in providing helpful, structured explanations of visualization of the plot(data: pd.DataFrame) method in the provided code. You divide the code into sections and provide a description of each section and an explanation. The first section should be named "accessibility" and describe the physical appearance of the chart (colors, chart type etc), the goal of the chart, as well the main insights from the chart. You can explain code across the following 3 dimensions: 1. accessibility: the physical appearance of the chart (colors, chart type etc), the goal of the chart, as well the main insights from the chart. 2. transformation: This should describe the section of the code that applies any kind of data transformation (filtering, aggregation, grouping, null value handling etc) 3. visualization: step by step description of the code that creates or modifies the presented visualization. """ H | Get Explanations For Decisions – Understand why this visualization Exercise 07
  44. Data Science Day 2024 | Nitya Narasimhan, PhD I |

    Get Recommendations for Visualizations relevant to your dataset Exercise 07
  45. Data Science Day 2024 | Nitya Narasimhan, PhD Challenge 09

    Make the Paradigm Shift – From MLOps to LLM Ops and Generative AI https://aka.ms/workshops/python-data-analysis
  46. Data Science Day 2024 | Nitya Narasimhan, PhD Data analysis

    – drives the ML models – that power AI algorithms. Image Credit | Microsoft Learn. Must-have skill for a data scientist. Good-to-have skill for an AI developer. Trends show a shift left in the application lifecycle, giving developers more responsibility in earlier stages of the workflow. It’s just fun to explore data and gain insights. Closing the Loop ..
  47. Data Science Day 2024 | Nitya Narasimhan, PhD ML Ops

    - App lifecycle for developing Predictive AI Image Credit | Microsoft Learn
  48. Data Science Day 2024 | Nitya Narasimhan, PhD LLM Ops

    - App lifecycle for developing Generative AI Image Credit | Microsoft Learn
  49. Data Science Day 2024 | Nitya Narasimhan, PhD Azure AI

    Week: https://aka.ms/ai-studio/intelligent-apps Image Credit | Microsoft Learn
  50. Data Science Day 2024 | Nitya Narasimhan, PhD Challenge 04

    Make It Friendly – I need a reproducible environment for easy collaboration https://aka.ms/workshops/python-data-analysis
  51. Data Science Day 2024 | Nitya Narasimhan, PhD Help me

    get set up and productive quickly .. The Approach – how can I practice goal-oriented learning? Make it FRICTIONLESS. Make progress towards goal without distractions: Keep it FOCUSED. Make it reproducible by others for collaboration: Make it FRIENDLY. Dev Containers, GitHub Codespaces, GitHub Copilot, Microsoft LIDA, Jupyter Notebooks, VS Code Profiles
  52. Data Science Day 2024 | Nitya Narasimhan, PhD 1 |

    Introduction – Data Analysis Challenges & Goals 2 | GitHub Codespaces – Reusable environments 3 | Visual Studio Code – Productivity extensions 4 | GitHub Copilot – AI-assisted learning 5 | Open Datasets – Community-inspired exploration 6 | Responsible AI – Model debugging for fairness 7 | Project LIDA – AI-assisted intuition & visualization 8 | Azure AI Studio – Paradigm shift to LLM Ops 9 | Summary – Questions & Next Steps
  53. Data Science Day 2024 | Nitya Narasimhan, PhD Week 2:

    Developer Tools 16/3 | GitHub Codespaces 17/3 | Visual Studio Code 18/3 | GitHub Copilot 19/3 | Open Datasets 20/3 | Responsible AI 21/3 | Project LIDA 22/3 | Azure AI Studio #14DaysOfDataScience Browse The Posts : https://aka.ms/2024/data-science-recipes
  54. Data Science Day 2024 | Nitya Narasimhan, PhD Data Science

    Collection 1. Data Science Foundations 2. Cloud Skills Challenge 3. Responsible AI 4. Data Science Curriculum 5. Data Science Handbook 6. Hugging Face Datasets 7. Kaggle Online Courses. Will be updated regularly. Bookmark Collection: https://aka.ms/2024-datasci-collection