Data Literacy

On 7 June 2022, Rostislav Yavorski, Head of Research at Exactpro, gave a lecture on data literacy.

Data literacy is the ability to read, understand, create, and communicate data as information. Data literacy skills include knowing what data is appropriate to use, interpreting data visualisations, understanding data analytics tools and methods, data storytelling, etc.

---

Follow us on
LinkedIn https://www.linkedin.com/company/exactpro-systems-llc
Twitter https://twitter.com/exactpro

June 07, 2022

Transcript

  1. BUILD SOFTWARE TO TEST SOFTWARE
    exactpro.com
    Data Literacy
    Rostislav Yavorski
    Head of Research, Exactpro

  2. (no text on this slide)

  3. Basic concepts

  4. What is data literacy
    The ability to read, understand, create, and communicate data as
    information.
    Data literacy skills include the following abilities:
    ● Knowing what data is appropriate to use for a particular
    purpose
    ● Interpreting data visualizations
    ● Understanding data analytics tools and methods
    ● Data storytelling, communicating information about data to
    other people

  5. Data collection
    The process of gathering and
    measuring information
    on targeted variables in an
    established system, which then
    enables one to answer relevant
    questions and evaluate outcomes

  6. Data formats: CSV, JSON
    Comma-Separated Values (CSV) and JavaScript Object Notation (JSON)
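
To make the two formats concrete, the sketch below parses a small CSV text into records and serializes the same records as JSON, using only Python's standard `csv` and `json` modules. The sample rows and the field names (`module`, `loc`, `defects`) are invented for illustration and do not come from the lecture.

```python
import csv
import io
import json

# Hypothetical sample data, invented for illustration only.
CSV_TEXT = "module,loc,defects\nparser,120,true\nlogger,45,false\n"


def csv_to_records(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def records_to_json(records):
    """Serialize the records as a JSON array of objects."""
    return json.dumps(records, indent=2)


records = csv_to_records(CSV_TEXT)
json_text = records_to_json(records)
print(json_text)
```

Note that `csv.DictReader` returns every value as a string, whereas JSON can carry numbers and booleans natively, so a real conversion would usually also cast the types.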

  7. Data quality
    ● Accuracy, whether or not given values are correct and consistent
    ● Completeness, there should be no gaps or missing information
    ● Reliability, how well a method measures something
    ● Relevance, consistency between the content of data and the area of interest
    ● Timeliness, the time between when information is expected and when it is
    readily available for use
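
Of the criteria above, completeness is the easiest to measure directly. The following is a minimal sketch, assuming records are held as Python dictionaries and that `None` or an empty string marks a missing value; the sample records and field names are hypothetical.

```python
def completeness(records, fields):
    """Return, for each field, the share of records with a non-missing value.

    A value counts as missing when it is None or an empty string
    (an assumption made for this sketch).
    """
    total = len(records)
    report = {}
    for field in fields:
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        report[field] = filled / total if total else 0.0
    return report


# Hypothetical sample records, invented for illustration only.
sample = [
    {"host": "a", "cpu": 0.42, "ts": "2022-06-07T10:00"},
    {"host": "b", "cpu": None, "ts": "2022-06-07T10:00"},
    {"host": "c", "cpu": 0.91, "ts": ""},
    {"host": "d", "cpu": 0.10, "ts": "2022-06-07T10:01"},
]

report = completeness(sample, ["host", "cpu", "ts"])
print(report)
```

A field with a share well below 1.0 is a candidate for closer inspection before any analysis relies on it.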

  8. Data visualization
    The graphic representation of data, a particularly efficient
    way of communicating
    Data visualization principles:
    ● show the data
    ● present many numbers in a small space
    ● encourage the eye to compare different pieces of
    data
    ● reveal the data at several levels of detail
    ● serve a clear purpose: description, exploration, or
    decoration

  9. Exploratory Data Analysis (EDA)

  10. Exploratory data analysis
    ● Size, format, source
    ● Data quality: missing values
    ● Minimum, maximum, median, quartiles for
    each parameter
    ● Outliers
    ● Simple visualization: boxplot, histogram,
    scatter plot
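
The numeric part of this checklist can be sketched with Python's standard `statistics` module. The example computes the minimum, quartiles, median, and maximum of one parameter and flags outliers with the common 1.5 × IQR rule; that rule and the sample values are assumptions of this sketch, not something the slides prescribe.

```python
import statistics


def five_number_summary(values):
    """Return min, first quartile, median, third quartile, and max."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}


def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    s = five_number_summary(values)
    iqr = s["q3"] - s["q1"]
    low, high = s["q1"] - k * iqr, s["q3"] + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements, invented for illustration only.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

summary = five_number_summary(data)
outliers = iqr_outliers(data)
print(summary)
print(outliers)
```

The same five numbers are what a boxplot draws, so this summary is a useful cross-check on the chart.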

  11. Exploratory data analysis
    I would recommend:
    ● “What is Exploratory Data Analysis” by Mel Restori
    https://chartio.com/learn/data-analytics/what-is-exploratory-data-analysis/
    ● “What is Exploratory Data Analysis” by Prasad Patil
    https://towardsdatascience.com/exploratory-data-analysis-8fc1cb20fd15

  12. Google Sheets Tutorials
    ● Railsware Product Academy, YouTube
    ● Google Cloud Skills Boost, Video+Docs+Quiz
    ● W3Schools, Lessons with practical examples
    ● Goodwill Community Foundation, 19 lessons

  13. Data Storytelling

  14. Storytelling with data
    https://www.storytellingwithdata.com/

  15. Storyboard example

  16. Storyboard example
    Start with an overview

  17. Storyboard example
    Zoom in on one part

  18. Storyboard example
    Zoom in on the other part

  19. Storyboard example
    Focus on relationships

  20. Storyboard example
    Conclude

  21. Storytelling with data
    https://www.storytellingwithdata.com/

  22. Homework assignment

  23. Todo list
    ● Make a list of several datasets you like
    ● Perform EDA on one of your choice
    (a histogram and a scatter plot are vital)
    ● Make up a story to present your insights
    from the data (3-5 slides)
    ● Send to [email protected]
    ● Be ready to defend it with a 5-minute talk
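
For the histogram step, a spreadsheet or plotting library is the natural tool, but the underlying binning is simple enough to sketch in plain Python. The example below counts values in equal-width bins and prints a text bar per bin; the function name, the bin count, and the sample values are all hypothetical.

```python
def histogram(values, bins=5):
    """Count values in equal-width bins spanning min(values)..max(values).

    Returns a list of (left_edge, right_edge, count) tuples.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1  # fall back to 1 when all values are equal
    counts = [0] * bins
    for v in values:
        # The maximum value would land one bin too far, so clamp it.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return [(lo + i * width, lo + (i + 1) * width, c)
            for i, c in enumerate(counts)]


# Hypothetical measurements, invented for illustration only.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 10]

bins_out = histogram(sample, bins=3)
for left, right, count in bins_out:
    print(f"{left:6.1f} to {right:6.1f} | {'#' * count}")
```

The choice of bin count changes the shape of the picture, so it is worth trying a few values before drawing conclusions.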

  24. Example 1. Software Defect Prediction
    Size: 10 885 modules, 22 attributes
    ● 5 different lines-of-code measures
    ● 3 McCabe metrics (cyclomatic, essential, and design complexity)
    ● 4 base Halstead measures (volume, length, difficulty, intelligence)
    ● 8 derived Halstead measures and a branch count
    ● 1 goal field (whether the module has one or more reported defects)
    Hypotheses:
    ● code with complicated pathways is more error-prone
    ● code that is hard to read is more likely to be fault-prone
    ● static measures can never be a certain indicator of the presence of a fault
    https://www.kaggle.com/datasets/semustafacevik/software-defect-prediction
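
One way to probe the first hypothesis is to compare defect rates for modules below and above a cyclomatic-complexity threshold. The sketch below is a minimal illustration under stated assumptions: the column names `v(g)` and `defects` are guesses at the dataset's CSV header and should be checked against the actual file, the threshold of 10 is a conventional choice rather than one taken from the slides, and the rows are toy values, not real data from the dataset.

```python
def defect_rate(rows, predicate):
    """Share of the rows matching `predicate` that have a reported defect."""
    selected = [r for r in rows if predicate(r)]
    if not selected:
        return None  # no modules in this group
    return sum(1 for r in selected if r["defects"]) / len(selected)


def compare_by_complexity(rows, threshold=10):
    """Return (rate for simple modules, rate for complex modules)."""
    low = defect_rate(rows, lambda r: r["v(g)"] <= threshold)
    high = defect_rate(rows, lambda r: r["v(g)"] > threshold)
    return low, high


# Toy rows, invented for illustration; "v(g)" and "defects" are assumed names.
toy_rows = [
    {"v(g)": 2, "defects": False},
    {"v(g)": 3, "defects": False},
    {"v(g)": 4, "defects": True},
    {"v(g)": 5, "defects": False},
    {"v(g)": 12, "defects": True},
    {"v(g)": 15, "defects": False},
    {"v(g)": 20, "defects": True},
    {"v(g)": 30, "defects": True},
]

low_rate, high_rate = compare_by_complexity(toy_rows)
print(low_rate, high_rate)
```

A gap between the two rates is only suggestive; as the third hypothesis notes, static measures cannot prove that a fault is present.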

  25. Example 2. Operational Data from Enterprise Application
    https://www.kaggle.com/datasets/anomalydetectionml/rawdata
    Goal: effectively detect run-time
    anomalies using machine learning on
    operation metrics
    The dataset consists of metrics
    measured from the operating system
    and from WebLogic Server
    monitoring beans
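
Before reaching for machine learning, a plain statistical baseline helps to see what an anomaly in a metric looks like. The sketch below is not the machine learning approach the dataset targets; it is a simple z-score check, with a threshold of 3 standard deviations chosen as an assumption, and the CPU readings are invented for illustration.

```python
import statistics


def zscore_anomalies(values, threshold=3.0):
    """Return indices of values more than `threshold` standard deviations
    away from the mean of the whole series."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # a constant series has no outliers by this measure
    return [i for i, v in enumerate(values)
            if abs(v - mean) / stdev > threshold]


# Hypothetical CPU readings with one spike at the end, invented for illustration.
cpu_load = [48.0, 52.0] * 10 + [500.0]

anomalies = zscore_anomalies(cpu_load)
print(anomalies)
```

A baseline like this considers one metric at a time and assumes a stable mean, which is one reason multivariate operational data usually calls for richer models.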

  26. Thank you!
    Questions?