Data Literacy - Speaker Deck

Slide 1

Slide 1 text

BUILD SOFTWARE TO TEST SOFTWARE
exactpro.com

Data Literacy
Rostislav Yavorski, Head of Research, Exactpro

Slide 2

Slide 2 text


Slide 3

Slide 3 text

Basic concepts

Slide 4

Slide 4 text

What is data literacy

The ability to read, understand, create, and communicate data as information. Data literacy skills include the following abilities:

● Knowing what data is appropriate to use for a particular purpose
● Interpreting data visualizations
● Understanding data analytics tools and methods
● Data storytelling: communicating information about data to other people

Slide 5

Slide 5 text

Data collection

The process of gathering and measuring information on targeted variables in an established system, which then enables one to answer relevant questions and evaluate outcomes.

Slide 6

Slide 6 text

Data formats: CSV, JSON

● CSV: Comma-Separated Values
● JSON: JavaScript Object Notation
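To make the difference between the two formats concrete, here is a minimal sketch that reads the same two records from CSV text and from JSON text using only the Python standard library. The sample records, field names, and helper function names are assumptions invented for illustration; they do not come from the slides.

```python
import csv
import io
import json

# Hypothetical sample records, invented for illustration only.
CSV_TEXT = "module,loc,defects\nparser,120,true\nlogger,45,false\n"
JSON_TEXT = (
    '[{"module": "parser", "loc": 120, "defects": true},'
    ' {"module": "logger", "loc": 45, "defects": false}]'
)


def read_csv_records(text):
    """Parse CSV text into a list of dicts; every value stays a string."""
    return list(csv.DictReader(io.StringIO(text)))


def read_json_records(text):
    """Parse JSON text; numbers and booleans keep their types."""
    return json.loads(text)


csv_rows = read_csv_records(CSV_TEXT)
json_rows = read_json_records(JSON_TEXT)

# CSV carries no type information, so "loc" is the string "120" here,
# while JSON delivers it as the integer 120.
print(type(csv_rows[0]["loc"]).__name__, type(json_rows[0]["loc"]).__name__)
```

The practical consequence is that CSV values usually need an explicit conversion step before analysis, whereas JSON preserves basic types and nesting at the cost of a more verbose file.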

Slide 7

Slide 7 text

Data quality

● Accuracy: whether or not given values are correct and consistent
● Completeness: there should be no gaps or missing information
● Reliability: how well a method measures something
● Relevance: consistency between the content of data and the area of interest
● Timeliness: the time between when information is expected and when it is readily available for use
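Of the criteria above, completeness is the easiest to check mechanically. The sketch below computes the share of non-missing values per column. The records, the choice of None and the empty string as "missing" markers, and the function name are all assumptions made for illustration.

```python
# Hypothetical records; None and "" stand for missing values (an assumption).
records = [
    {"module": "parser", "loc": 120, "defects": True},
    {"module": "logger", "loc": None, "defects": False},
    {"module": "cache", "loc": 45, "defects": None},
]


def completeness(rows):
    """Return the share of non-missing values per column (1.0 = complete)."""
    report = {}
    for col in rows[0].keys():
        # Count rows where the column holds something other than None or "".
        present = sum(1 for row in rows if row.get(col) not in (None, ""))
        report[col] = present / len(rows)
    return report


report = completeness(records)
for column, share in report.items():
    print(f"{column}: {share:.0%} complete")
```

Note that legitimate values such as False or 0 are counted as present; only the two explicit missing markers are treated as gaps.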

Slide 8

Slide 8 text

Data visualization

The graphic representation of data, a particularly efficient way of communicating.

Data visualization principles:

● show the data
● present many numbers in a small space
● encourage the eye to compare different pieces of data
● reveal the data at several levels of detail
● serve a clear purpose: description, exploration, or decoration

Slide 9

Slide 9 text

Exploratory Data Analysis (EDA)

Slide 10

Slide 10 text

Exploratory data analysis

● Size, format, source
● Data quality: missing values
● Minimum, maximum, median, quartiles for each parameter
● Outliers
● Simple visualization: boxplot, histogram, scatter plot
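The numeric part of this checklist can be sketched for a single parameter with the standard library. The values below are invented for illustration, and the 1.5 × IQR rule used to flag outliers is one common convention (the same one a boxplot typically uses), assumed here rather than prescribed by the slides.

```python
import statistics

# Hypothetical measurements of one parameter, invented for illustration.
values = [12, 15, 14, 10, 18, 16, 13, 95, 17, 11]


def summarize(data):
    """Return min, max, quartiles, and IQR-based outliers for one parameter."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the quartiles are outliers.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "count": len(data),
        "min": min(data),
        "max": max(data),
        "q1": q1,
        "median": median,
        "q3": q3,
        "outliers": [x for x in data if x < low or x > high],
    }


summary = summarize(values)
for name, value in summary.items():
    print(f"{name}: {value}")
```

On this sample the single value 95 is flagged, which is the kind of point worth inspecting before drawing a histogram or scatter plot.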

Slide 11

Slide 11 text

Exploratory data analysis

I would recommend:

● “What is Exploratory Data Analysis” by Mel Restori
  https://chartio.com/learn/data-analytics/what-is-exploratory-data-analysis/
● “What is Exploratory Data Analysis” by Prasad Patil
  https://towardsdatascience.com/exploratory-data-analysis-8fc1cb20fd15

Slide 12

Slide 12 text

Google Sheets Tutorials

● Railsware Product Academy: YouTube
● Google Cloud Skills Boost: video, docs, quiz
● W3Schools: lessons with practical examples
● Goodwill Community Foundation: 19 lessons

Slide 13

Slide 13 text

Data Storytelling

Slide 14

Slide 14 text

Storytelling with data

https://www.storytellingwithdata.com/

Slide 15

Slide 15 text

Storyboard example

Slide 16

Slide 16 text

Storyboard example

Start with an overview

Slide 17

Slide 17 text

Storyboard example

Zoom in on one part

Slide 18

Slide 18 text

Storyboard example

Zoom in on the other part

Slide 19

Slide 19 text

Storyboard example

Focus on relationships

Slide 20

Slide 20 text

Storyboard example

Conclude

Slide 21

Slide 21 text

Storytelling with data

https://www.storytellingwithdata.com/

Slide 22

Slide 22 text

Homework assignment

Slide 23

Slide 23 text

Todo list

● Make a list of several datasets you like
● Perform EDA on one of your choice (a histogram and a scatter plot are vital)
● Make up a story to present your insights from the data (3-5 slides)
● Send to [email protected]
● Be ready to defend it in a 5-minute talk

Slide 24

Slide 24 text

Example 1. Software Defect Prediction

Size: 10 885 modules, 22 attributes

● 5 different lines-of-code measures
● 3 McCabe metrics (cyclomatic, essential, design complexity)
● 4 base Halstead measures (volume, length, difficulty, intelligence)
● 8 derived Halstead measures, a branch-count
● 1 goal field (module has/has not one or more reported defects)

Hypotheses:

● code with complicated pathways is more error-prone
● code that is hard to read is more likely to be fault-prone
● static measures can never be a certain indicator of the presence of a fault

https://www.kaggle.com/datasets/semustafacevik/software-defect-prediction
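One simple way to probe the first hypothesis is to split modules by cyclomatic complexity and compare defect rates in the two groups. The sketch below does this on a handful of made-up rows; the rows, the threshold of 10, and the function names are assumptions for illustration and are not taken from the Kaggle dataset.

```python
# Hypothetical rows: (cyclomatic complexity, has a reported defect).
# Invented for illustration; NOT taken from the Kaggle dataset.
modules = [
    (2, False), (3, False), (4, False), (5, True),
    (12, True), (15, False), (20, True), (31, True),
]


def defect_rate(rows):
    """Share of modules in rows that have a reported defect."""
    if not rows:
        return 0.0
    return sum(1 for _, defective in rows if defective) / len(rows)


def split_by_complexity(rows, threshold):
    """Split rows into (simple, complex) groups around the given threshold."""
    simple = [r for r in rows if r[0] <= threshold]
    complex_ = [r for r in rows if r[0] > threshold]
    return simple, complex_


# The threshold of 10 is an assumed cut-off chosen for this sketch.
simple, complex_ = split_by_complexity(modules, threshold=10)
rate_simple = defect_rate(simple)
rate_complex = defect_rate(complex_)

print(f"defect rate, simple modules:  {rate_simple:.2f}")
print(f"defect rate, complex modules: {rate_complex:.2f}")
```

A higher rate in the complex group would be consistent with the first hypothesis, but, in line with the third one, it would not make complexity a certain indicator: in this toy sample one complex module has no defect and one simple module has one.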

Slide 25

Slide 25 text

Example 2. Operational Data from Enterprise Application

https://www.kaggle.com/datasets/anomalydetectionml/rawdata

Goal: effectively detect run-time anomalies using machine learning on operation metrics.

The dataset consists of metrics measured from the operating system and from WebLogic Server monitoring beans.

Slide 26

Slide 26 text

Thank you! Questions?