Presenting Data 2024

Thomas E. Love
September 04, 2024

Presenting Effectively with Data, when you’re in a hurry

CRSP 413 Seminar, 2024-09-04

Transcript

  1. Presenting Effectively with Data, when you’re in a hurry (…

    You’re always in a hurry) CRSP 413: Communication in Clinical Research Seminar 2024-09-04 Thomas E. Love, Ph.D. [email protected] https://speakerdeck.com/thomaselove/presenting-data-2024
  2. Presenting Research

    • Usually, this is highly abridged
      – Slide shows
      – Abstracts
      – Journal articles
      – Books
      – Websites
    • Announce the findings and try to convince us that the results are correct.

    Christopher Gandrud’s ideas, mostly – his book is Reproducible Research with R and RStudio
  3. You have Ten Minutes?

    • No time for subtlety.
    • Round, a lot.
    • Edit, ruthlessly.
      – One pass through software (“default options”) is never enough.
      – Better for people to leave the table hungry than stuffed.
    • Have something to say, and say it clearly.
    • Some possibilities are never a good choice.

    I am deeply in Andrew Gelman’s debt – see http://andrewgelman.com/
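One way to act on "Round, a lot" is to round to a fixed number of significant figures before a number reaches a slide. The following is a minimal sketch in Python (an assumption; the deck itself names R as its software). The helper name `round_sig` and the sample inputs are hypothetical and chosen only for illustration.

```python
from math import floor, log10


def round_sig(x: float, sig: int = 2) -> float:
    """Round x to `sig` significant figures (hypothetical helper)."""
    if x == 0:
        return 0.0
    # Position of the leading digit: 3 for 1234.5, -1 for 0.61834.
    magnitude = floor(log10(abs(x)))
    # Keep `sig` digits counting from the leading one.
    return round(x, sig - 1 - magnitude)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the deck.
    for raw in (1234.5, 0.61834, 0.004567):
        print(raw, "->", round_sig(raw, 2))
```

Two significant figures is a deliberately aggressive default here; a ten-minute audience can rarely absorb more.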
  4. All graphs are comparisons. All of statistics are comparisons.

    I am deeply in Andrew Gelman’s debt – see http://andrewgelman.com/
  5. First Law of Statistics: DTDP

    • Draw
    • The
    • D@$%
    • Picture

    A picture is worth a lot of numbers...
  6. We had to do a lot more from home in

    2020. Based on the American Time Use Survey, we spent about 62% of our waking time at home. In contrast, we only spent about 50% in 2019. Here is the breakdown by activity on a weekday.
  7. Clearly Communicating Quantitative Information

    • Are the most important elements or relationships visually most prominent?
    • Are the elements, symbol shapes and colors consistent with their use in previous graphs?
    • Are all of the graphical elements necessary to convey the relationships?
    • Are the graphical elements accurately positioned and scaled?

    http://www.datavis.ca/gallery/index.php
  8. What are you trying to do?

    • Is this information visualization (grabby, visually striking – dramatize the problem to draw the casual viewer in deeper)?
    • Or statistical graphics (reveal patterns and discrepancies for viewers who are already interested in the problem)?
    • Make tradeoffs carefully – meaningful choices.
  9. Source: Hermann Brenner, "Long-term survival rates of cancer patients achieved

    by the end of the 20th century: a period analysis," The Lancet, 360 (October 12, 2002), 1131-1135. edwardtufte.com
  10. In addition to slopegraphs, consider sparklines: intense, simple, word-sized graphics

    The most common data display is a noun accompanied by a number. For example, a medical patient's current level of glucose is reported in a clinical record as a word and number: edwardtufte.com
  11. sparklines: intense, simple, word-sized graphics Placed in the relevant context,

    a single number gains meaning. Thus, the most recent measurement of glucose should be compared with earlier measurements for the patient. This data-line shows the path of the last 80 readings of glucose: edwardtufte.com
  12. sparklines: intense, simple, word-sized graphics Lacking a scale of measurement,

    this free-floating line is de-quantified. At least we do know the value of the line’s right-most data point, which corresponds to the most recent value of glucose, the number recorded at far right. Both representations of the most recent reading are tied together with a color accent: edwardtufte.com
  13. sparklines: intense, simple, word-sized graphics Some useful context is provided

    by showing the normal range of glucose, here as a gray band. Compared to normal limits, readings above the band horizon are elevated, those below reduced: edwardtufte.com
  14. sparklines: intense, simple, word-sized graphics For clinical analysis, the task

    is to detect quickly and assess wayward deviations from normal limits, shown here by visual deviations outside the gray band. Multiplying this format brings in additional data from the medical record; a stack, which can show hundreds of variables and thousands of measurements, allows fast effective parallel comparisons: edwardtufte.com
  15. sparklines: intense, simple, word-sized graphics These little data lines, because

    of their active quality over time, are named sparklines – small, high-resolution graphics usually embedded in a full context of words, numbers, images. Sparklines are datawords: data-intense, design-simple, word-sized graphics. edwardtufte.com
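The sparkline idea described in the slides above can be approximated in plain text. This is a rough sketch, not Tufte's method: it maps each reading to one of eight Unicode block characters, prints the most recent value at far right, and flags readings outside a normal band. The function names, the glucose readings, and the 70 to 100 range are all assumptions made for illustration.

```python
BARS = "▁▂▃▄▅▆▇█"  # eight heights, lowest to highest


def sparkline(values):
    """Map each value to a block character scaled between min and max."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return BARS[0] * len(values)  # flat series: draw a flat line
    top = len(BARS) - 1
    return "".join(BARS[int((v - lo) / span * top + 0.5)] for v in values)


def flag_range(values, low, high):
    """Mark each reading: above the band, below it, or within it."""
    return "".join("↑" if v > high else "↓" if v < low else "·" for v in values)


if __name__ == "__main__":
    # Hypothetical glucose readings and an assumed normal band.
    glucose = [88, 92, 101, 97, 110, 126, 118, 95, 84, 68, 79, 128]
    print("glucose", sparkline(glucose), glucose[-1])
    print("       ", flag_range(glucose, low=70, high=100))
```

Printing the latest value beside the line follows the slides' point that a free-floating line with no scale is de-quantified; the second row stands in for the gray normal-range band.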
  16. Reproducible Research?

    • The goal of reproducible research is to tie specific instructions to data analysis so that scholarship can be recreated, better understood and verified.
    • This is usually facilitated by literate programming – a document that combines content and data analytic code.
    • Software? R and RStudio, mostly…
  17. Goals of Reproducible Analysis

    • Be able to reproduce your own results
    • Allow others to reproduce your results
    • Reproduce an entire report, manuscript, thesis, book, website with a single system command when changes occur in:
      – Operating system, stat software, graphics engines, source data, derived variables, analysis, interpretation
    • Save time
    • Provide the ultimate documentation of work done for a paper, etc.

    http://biostat.mc.vanderbilt.edu/wiki/pub/Main/ReproducibleResearchTutorial/HarrellScottTutorial-useR2012.pdf
  18. Five Practical Tips for Reproducible Research

    1. Document everything
    2. Everything is a (text) file
    3. All files should be human-readable
    4. Explicitly tie your files together
    5. Have a plan to organize, store, and make your files available.
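Tips 2 to 4 can be illustrated with a tiny driver script in which every artifact is a plain-text file and one function runs the steps in order. This is only a sketch under stated assumptions: it is written in Python rather than the R and RStudio tooling the deck mentions, and the file names, step names, and data values are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def make_raw(path: Path) -> None:
    """Step 1: write the source data (stand-in for a real data pull)."""
    rows = [("id", "glucose"), (1, 92), (2, 118), (3, 101)]  # made-up values
    with path.open("w", newline="") as f:
        csv.writer(f).writerows(rows)


def summarise(raw: Path, out: Path) -> None:
    """Step 2: read the raw CSV and write a human-readable summary."""
    with raw.open(newline="") as f:
        values = [float(row["glucose"]) for row in csv.DictReader(f)]
    mean = sum(values) / len(values)
    out.write_text(f"n={len(values)} mean={mean:.0f}\n")


def run_all(workdir: Path) -> str:
    """Tie the files together: each step names its inputs and outputs."""
    raw = workdir / "raw.csv"
    summary = workdir / "summary.txt"
    make_raw(raw)
    summarise(raw, summary)
    return summary.read_text()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(run_all(Path(d)), end="")
```

Because `run_all` is the single entry point, rerunning it after a change to the source data or a derived variable regenerates every downstream file, which is the "single system command" goal stated earlier in the deck.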
  19. But other people will use my data and code to compete with me?

    • True.
    • But competition means that strangers will read your papers, try to learn from them, cite them, and try to do even better.
    • If you prefer obscurity, why are you publishing?

    Donoho DL 2010
  20. Chatfield’s Six Rules for Data Analysis

    1. Do not attempt to analyze the data until you understand what is being measured and why.
    2. Find out how the data were collected.
    3. Look at the structure of the data.
    4. Carefully examine the data in an exploratory way, before attempting a more sophisticated analysis.
    5. Use your common sense at all times.
    6. Report the results in a clear, self-explanatory way.

    From Problem Solving: A Statistician’s Guide by Chris Chatfield, 2nd Edition, Chapman & Hall.
  21. You have Ten Minutes?

    • No time for subtlety.
    • Round, a lot.
    • Edit, ruthlessly.
      – One pass through software (“default options”) is never enough.
      – Better for people to leave the table hungry than stuffed.
    • Have something to say, and say it clearly.
    • Stay away from the pie.
  22. Statistics is too important to be left to statisticians. [email protected]

    https://speakerdeck.com/thomaselove/presenting-effectively-with-data-in-a-hurry See also: Karl Broman’s “Creating Effective Figures and Tables” (slides) at tinyurl.com/graphs2017