Presenting Effectively with Data (in a Hurry)

Thomas E. Love
September 12, 2023

Short talk given in various forms as part of CRSP 413 over the past few years. My goal here is to show and tell you a little about presenting effectively with data, and setting up your research life so that presenting effectively becomes easier.

Transcript

  1. Presenting Effectively with Data,
    when you’re in a hurry
    (… You’re always in a hurry)
    CRSP 413: Communication in Clinical Research Seminar
    2023-09-13
    Thomas E. Love, Ph.D.
    [email protected]
    https://speakerdeck.com/thomaselove/presenting-effectively-with-data-in-a-hurry

  2. Presenting Research
    • Usually, this is highly abridged
    – Slide shows
    – Abstracts
    – Journal articles
    – Books
    – Websites
    • Announce the findings and try to convince us
    that the results are correct.
    Christopher Gandrud’s ideas, mostly – his book is Reproducible Research with R and R Studio

  4. You have Ten Minutes?
    • No time for subtlety.
    • Round, a lot.
    • Edit, ruthlessly.
    – One pass through software (“default options”) is
    never enough.
    – Better for people to leave the table hungry than
    stuffed.
    • Have something to say, and say it clearly.
    • Some possibilities are never a good choice.
    I am deeply in Andrew Gelman’s debt – see http://andrewgelman.com/
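
"Round, a lot" can be made routine before numbers reach a slide. The sketch below is one possible approach, not something taken from the talk: `round_sig` is a hypothetical helper, and the estimates are made-up values used only for illustration.

```python
from math import floor, log10


def round_sig(x: float, sig: int = 2) -> float:
    """Round x to `sig` significant figures (hypothetical helper)."""
    if x == 0:
        return 0.0
    # Number of decimal places needed to keep `sig` significant figures.
    digits = sig - 1 - floor(log10(abs(x)))
    return round(x, digits)


# Made-up estimates, standing in for raw software output.
estimates = {
    "odds ratio": 1.8374,
    "mean difference": -12.4917,
    "p-value": 0.03418,
}

for label, value in estimates.items():
    print(f"{label}: {round_sig(value)}")
```

Two significant figures is an assumed default; the right amount of rounding depends on the audience and the measurement.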

  5. All graphs are comparisons.
    All of statistics are comparisons.
    I am deeply in Andrew Gelman’s debt – see http://andrewgelman.com/

  6. First Law of Statistics: DTDP
    • Draw
    • The
    • D@$%
    • Picture
    A picture is worth a lot of numbers...

  7. Karl Broman, “Creating Effective Figures and Tables” at tinyurl.com/graphs2017

  8. What’s wrong with this picture?

  10. Which of these three bar graphs
    describes the same data as pie
    graph A?

  12. We had to do a lot more
    from home in 2020. Based
    on the American Time Use
    Survey, we spent about 62%
    of our waking time at home.
    In contrast, we only spent
    about 50% in 2019.
    Here is the breakdown by
    activity on a weekday.

  13. https://twitter.com/HWippick/status/1118738492983521286/photo/1

  14. https://twitter.com/asher_rosinger/status/1119278062804328448/photo/1

  15. Clearly Communicating Quantitative
    Information
    • Are the most important elements or
    relationships visually most prominent?
    • Are the elements, symbol shapes and colors
    consistent with their use in previous graphs?
    • Are all of the graphical elements necessary to
    convey the relationships?
    • Are the graphical elements accurately
    positioned and scaled?
    http://www.datavis.ca/gallery/index.php

  16. Don’t clutter each plot…

  17. Small multiples

  19. What are you trying to do?
    • Is this information visualization (grabby,
    visually striking – dramatize the problem to
    draw the casual viewer in deeper)?
    • Or statistical graphics (reveal patterns and
    discrepancies for viewers who are already
    interested in the problem)?
    • Make tradeoffs carefully – meaningful choices.

  20. From Karl Broman…
    Karl Broman, “Creating Effective Figures and Tables” at tinyurl.com/graphs2017

  21. Don’t sort alphabetically
    Karl Broman, “Creating Effective Figures and Tables” at tinyurl.com/graphs2017
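
A minimal sketch of the alternative to alphabetical order: sort categories by the quantity being compared, so the ranking is visible at a glance. The category names and counts are made up, and `order_by_value` is a hypothetical helper rather than anything from Broman's slides.

```python
# Made-up counts; alphabetical order here would hide the ranking.
counts = {"Asthma": 41, "COPD": 97, "Diabetes": 63, "Hypertension": 120}


def order_by_value(data: dict[str, int]) -> list[tuple[str, int]]:
    """Return (label, value) pairs sorted from largest to smallest value."""
    return sorted(data.items(), key=lambda pair: pair[1], reverse=True)


# A crude text bar chart, one '#' per 10 units, largest category first.
for label, n in order_by_value(counts):
    print(f"{label:<14}{'#' * (n // 10)} {n}")
```

The same ordering would normally be passed to whatever plotting tool draws the final figure.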

  22. Karl Broman, “Creating Effective Figures and Tables” at tinyurl.com/graphs2017

  38. https://www.nytimes.com/interactive/2022/08/01/upshot/rich-poor-friendships.html

  40. On being “approximately right
    rather than exactly wrong”
    John Tukey

  41. Source: Hermann Brenner, "Long-term survival rates of cancer patients
    achieved by the end of the 20th century: a period analysis,"
    The Lancet, 360 (October 12, 2002), 1131-1135.
    edwardtufte.com

  42. edwardtufte.com

  43. Slopegraphs!
    edwardtufte.com

  44. edwardtufte.com

  45. In addition to slopegraphs, consider
    sparklines: intense, simple, word-sized graphics
    The most common data display is a noun
    accompanied by a number.
    For example, a medical patient's current level of
    glucose is reported in a clinical record as a word
    and number:
    edwardtufte.com

  46. sparklines: intense, simple, word-sized graphics
    Placed in the relevant context, a single number
    gains meaning. Thus, the most recent
    measurement of glucose should be compared
    with earlier measurements for the patient. This
    data-line shows the path of the last 80 readings
    of glucose:
    edwardtufte.com

  47. sparklines: intense, simple, word-sized graphics
    Lacking a scale of measurement, this free-floating
    line is de-quantified. At least we do
    know the value of the line’s right-most data
    point, which corresponds to the most recent
    value of glucose, the number recorded at far
    right. Both representations of the most recent
    reading are tied together with a color accent:
    edwardtufte.com

  48. sparklines: intense, simple, word-sized graphics
    Some useful context is provided by showing the
    normal range of glucose, here as a gray band.
    Compared to normal limits, readings
    above the band horizon are elevated, those
    below reduced:
    edwardtufte.com

  49. sparklines: intense, simple, word-sized graphics
    For clinical analysis, the task is to detect quickly
    and assess wayward deviations from normal
    limits, shown here by visual deviations outside
    the gray band. Multiplying this format brings in
    additional data from the medical record; a stack,
    which can show hundreds of variables and
    thousands of measurements, allows fast effective
    parallel comparisons:
    edwardtufte.com

  50. sparklines: intense, simple, word-sized graphics
    These little data lines, because of their active
    quality over time, are named sparklines—small,
    high-resolution graphics usually embedded in a
    full context of words, numbers, images.
    Sparklines are datawords: data-intense,
    design-simple, word-sized graphics.
    edwardtufte.com
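
A rough text-only approximation of the idea, assuming Unicode block characters are an acceptable stand-in for a drawn line. This is not Tufte's rendering: it has no normal-range band, the `sparkline` helper is hypothetical, and the glucose readings are made up.

```python
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values: list[float]) -> str:
    """Map each value to one of eight block heights, scaled to the data range."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series has no variation to show.
        return BLOCKS[0] * len(values)
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - lo) * top / span)] for v in values)


# Made-up glucose readings; the latest value is printed beside the
# sparkline, following the noun-plus-number display described above.
readings = [98, 104, 110, 131, 127, 115, 109, 121, 128]
print(f"glucose {sparkline(readings)} {readings[-1]}")
```

Because the line is scaled to its own minimum and maximum, it shows shape only, which is why the most recent number is printed next to it.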

  51. Reproducible Research?
    • The goal of reproducible research is to tie
    specific instructions to data analysis so that
    scholarship can be recreated, better
    understood and verified.
    • This is usually facilitated by literate
    programming – a document that combines
    content and data analytic code.
    • Software? R and RStudio, mostly…

  53. Goals of Reproducible Analysis
    • Be able to reproduce your own results
    • Allow others to reproduce your results
    • Reproduce an entire report, manuscript, thesis,
    book, website with a single system command
    when changes occur in:
    – Operating system, stat software, graphics engines,
    source data, derived variables, analysis, interpretation
    • Save time
    • Provide the ultimate documentation of work
    done for a paper, etc.
    http://biostat.mc.vanderbilt.edu/wiki/pub/Main/ReproducibleResearchTutorial/HarrellScottTutorial-useR2012.pdf

  54. Five Practical Tips for Reproducible
    Research
    1. Document everything
    2. Everything is a (text) file
    3. All files should be human-readable
    4. Explicitly tie your files together
    5. Have a plan to organize, store, and make your
    files available.
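
One possible way to act on tips 2 through 4 is to keep a human-readable manifest that ties a project's text files together by checksum. The sketch below is an illustration under stated assumptions, not a method from the talk: the `MANIFEST.json` filename, the file patterns, and both function names are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 checksum of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def write_manifest(project_dir: Path,
                   patterns: tuple[str, ...] = ("*.csv", "*.py", "*.md")) -> Path:
    """Write MANIFEST.json (assumed name) listing a checksum per matching file."""
    entries = {}
    for pattern in patterns:
        for path in sorted(project_dir.rglob(pattern)):
            # Store paths relative to the project so the manifest is portable.
            entries[path.relative_to(project_dir).as_posix()] = file_digest(path)
    manifest = project_dir / "MANIFEST.json"
    manifest.write_text(json.dumps(entries, indent=2, sort_keys=True))
    return manifest
```

Calling `write_manifest(Path("."))` from the project root after each analysis run would record which versions of the data and code produced the current results; comparing two manifests then shows what changed.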

  55. Why we do this…

  56. But other people will use my data and
    code to compete with me?
    • True.
    • But competition means that strangers will
    read your papers, try to learn from them, cite
    them, and try to do even better.
    • If you prefer obscurity, why are you
    publishing?
    Donoho DL 2010

  57. https://leanpub.com/modernscientist
    A book about how to be
    a scientist the modern,
    open-source way.

  58. Chatfield’s Six Rules for Data Analysis
    1. Do not attempt to analyze the data until you
    understand what is being measured and why.
    2. Find out how the data were collected.
    3. Look at the structure of the data.
    4. Carefully examine the data in an exploratory way,
    before attempting a more sophisticated analysis.
    5. Use your common sense at all times.
    6. Report the results in a clear, self-explanatory way.
    From Problem Solving: A Statistician’s Guide by
    Chris Chatfield, 2nd Edition, Chapman & Hall.

  59. https://www.boredpanda.com/world-war-2-aircraft-survivorship-bias-abraham-wald/

  60. Howard Wainer,
    Visual Revelations
    Diagram of all of the places where the planes
    were damaged the most

  61. https://www.financialgazette.co.zw/survivorship-bias-how-data-driven-decisions-can-be-wrong/ww2-survivorship-bias-560x333/

  62. You have Ten Minutes?
    • No time for subtlety.
    • Round, a lot.
    • Edit, ruthlessly.
    – One pass through software (“default options”) is
    never enough.
    – Better for people to leave the table hungry than
    stuffed.
    • Have something to say, and say it clearly.
    • Stay away from the pie.

  63. Statistics is too important to be
    left to statisticians.
    [email protected]
    https://speakerdeck.com/thomaselove/presenting-effectively-with-data-in-a-hurry
    See also: Karl Broman’s “Creating Effective Figures and Tables” (slides)
    at tinyurl.com/graphs2017
