Presenting Effectively with Data (in a Hurry)

A short talk, given in various forms as part of CRSP 413 over the past few years. My goal here is to show and tell you a little about presenting effectively with data, and about setting up your research life so that presenting effectively becomes easier.

Thomas E. Love

August 26, 2020

Transcript

  1. Presenting Effectively with Data, when you’re in a hurry (… You’re always in a hurry)

    CRSP 413: Communication in Clinical Research Seminar
    2020-08-26
    Thomas E. Love, Ph.D.
    Thomas.Love@case.edu
    https://speakerdeck.com/thomaselove/presenting-effectively-with-data-in-a-hurry
  2. Presenting Research

    • Usually, this is highly abridged
      – Slide shows
      – Abstracts
      – Journal articles
      – Books
      – Websites
    • Announce the findings and try to convince us that the results are correct.

    Christopher Gandrud’s ideas, mostly; his book is Reproducible Research with R and RStudio
  3. None
  4. You have Ten Minutes?

    • No time for subtlety.
    • Round, a lot.
    • Edit, ruthlessly.
      – One pass through software (“default options”) is never enough.
      – Better for people to leave the table hungry than stuffed.
    • Have something to say, and say it clearly.
    • Some possibilities are never a good choice.

    I am deeply in Andrew Gelman’s debt – see http://andrewgelman.com/
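    The "Round, a lot" advice can be made mechanical. The following is a minimal sketch, not part of the talk: the function name `round_for_slide` and the default of two significant figures are assumptions chosen for illustration.

```python
from math import floor, log10


def round_for_slide(value: float, sig: int = 2) -> str:
    """Format a number to `sig` significant figures for display on a slide.

    Hypothetical helper: the name and the default of 2 are illustrative choices.
    """
    if value == 0:
        return "0"
    # Number of decimal places needed to keep `sig` significant figures;
    # negative means we round to tens, hundreds, thousands, and so on.
    decimals = sig - 1 - floor(log10(abs(value)))
    rounded = round(value, decimals)
    if decimals <= 0:
        # Whole numbers: drop the decimal point and add thousands separators.
        return f"{int(rounded):,}"
    return f"{rounded:.{decimals}f}"


print(round_for_slide(48213.7))    # 48,000
print(round_for_slide(0.123456))   # 0.12
```

    Reporting 48,000 rather than 48213.7 loses nothing an audience can absorb in ten minutes, and it keeps attention on the comparison rather than the digits.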
  5. All graphs are comparisons. All of statistics are comparisons.

    I am deeply in Andrew Gelman’s debt – see http://andrewgelman.com/
  6. First Law of Statistics: DTDP

    • Draw
    • The
    • D@$%
    • Picture

    A picture is worth a lot of numbers...
  7. Karl Broman, “Creating Effective Figures and Tables” at tinyurl.com/graphs2017

  8. What’s wrong with this picture?

  9. None
  10. http://flowingdata.com/2009/11/26/fox-news-makes-the-best-pie-chart-ever/

  11. Which of these three bar graphs describes the same data as pie graph A?
  12. Stay away from the pie

  13. None
  14. https://twitter.com/HWippick/status/1118738492983521286/photo/1

  15. https://twitter.com/asher_rosinger/status/1119278062804328448/photo/1

  16. Clearly Communicating Quantitative Information

    • Are the most important elements or relationships visually most prominent?
    • Are the elements, symbol shapes and colors consistent with their use in previous graphs?
    • Are all of the graphical elements necessary to convey the relationships?
    • Are the graphical elements accurately positioned and scaled?

    http://www.datavis.ca/gallery/index.php
  17. Don’t clutter each plot…

  18. Small multiples

  19. None
  20. What are you trying to do?

    • Is this information visualization (grabby, visually striking – dramatize the problem to draw the casual viewer in deeper)?
    • Or statistical graphics (reveal patterns and discrepancies for viewers who are already interested in the problem)?
    • Make tradeoffs carefully – meaningful choices.
  21. From Karl Broman…

    Karl Broman, “Creating Effective Figures and Tables” at tinyurl.com/graphs2017
  22. Don’t sort alphabetically

    Karl Broman, “Creating Effective Figures and Tables” at tinyurl.com/graphs2017
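    To illustrate why sorting by value beats sorting alphabetically, here is a small sketch that draws a text bar chart ordered from largest to smallest. It is not from Broman's slides: the function name `text_bar_chart` and the category counts are made up for the example, and it assumes a non-empty set of positive values.

```python
from __future__ import annotations


def text_bar_chart(data: dict[str, float], width: int = 30) -> list[str]:
    """Return one text bar per category, ordered by value (largest first).

    Hypothetical helper; assumes `data` is non-empty with positive values.
    """
    ordered = sorted(data.items(), key=lambda item: item[1], reverse=True)
    biggest = ordered[0][1]
    label_width = max(len(label) for label, _ in ordered)
    lines = []
    for label, value in ordered:
        # Scale each bar relative to the largest value.
        bar = "#" * round(width * value / biggest)
        lines.append(f"{label:<{label_width}} {bar} {value:g}")
    return lines


# Made-up counts, purely for illustration.
counts = {"Alaska": 3, "Ohio": 12, "Texas": 30}
for line in text_bar_chart(counts):
    print(line)
```

    With the bars ordered by size, the ranking is visible at a glance; an alphabetical ordering would force the reader to compare bar lengths that are not adjacent.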
  23. Karl Broman, “Creating Effective Figures and Tables” at tinyurl.com/graphs2017

  24. None
  25. None
  26. None
  27. None
  28. Reproducible Research?

    • The goal of reproducible research is to tie specific instructions to data analysis so that scholarship can be recreated, better understood and verified.
    • This is usually facilitated by literate programming – a document that combines content and data analytic code.
    • Software? R and RStudio, mostly…
  29. None
  30. Goals of Reproducible Analysis

    • Be able to reproduce your own results
    • Allow others to reproduce your results
    • Reproduce an entire report, manuscript, thesis, book, website with a single system command when changes occur in:
      – Operating system, stat software, graphics engines, source data, derived variables, analysis, interpretation
    • Save time
    • Provide the ultimate documentation of work done for a paper, etc.

    http://biostat.mc.vanderbilt.edu/wiki/pub/Main/ReproducibleResearchTutorial/HarrellScottTutorial-useR2012.pdf
  31. Five Practical Tips for Reproducible Research

    1. Document everything
    2. Everything is a (text) file
    3. All files should be human-readable
    4. Explicitly tie your files together
    5. Have a plan to organize, store, and make your files available.
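    As one possible reading of tips 2 through 4, the sketch below rebuilds a plain-text report from a plain-text data file and records in the report exactly which data file it came from. This is an illustration in Python rather than the R and RStudio workflow the talk recommends; the function name `build_report`, the `value` column, and the report layout are all assumptions.

```python
import csv
import hashlib
import statistics
from pathlib import Path


def build_report(data_path: Path, report_path: Path) -> str:
    """Regenerate a human-readable report from a CSV file.

    Hypothetical example: assumes the CSV has a numeric column named "value".
    """
    raw = data_path.read_bytes()
    # A short fingerprint of the input ties the report to one exact data file.
    digest = hashlib.sha256(raw).hexdigest()[:12]
    rows = list(csv.DictReader(raw.decode("utf-8").splitlines()))
    values = [float(row["value"]) for row in rows]
    text = "\n".join([
        f"source: {data_path.name} (sha256 {digest})",
        f"n: {len(values)}",
        f"mean: {statistics.mean(values):.1f}",
    ]) + "\n"
    report_path.write_text(text, encoding="utf-8")
    return text
```

    Calling `build_report` from a single script means that when the source data change, one command regenerates the report, and the recorded fingerprint shows whether a given report is stale.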
  32. Why we do this…

  33. But other people will use my data and code to compete with me?

    • True.
    • But competition means that strangers will read your papers, try to learn from them, cite them, and try to do even better.
    • If you prefer obscurity, why are you publishing?

    Donoho DL 2010
  34. https://leanpub.com/modernscientist

    A book about how to be a scientist the modern, open-source way.
  35. FiveThirtyEight (forecast 2020-08-25)

  36. None
  37. None
  38. FiveThirtyEight (forecast 2020-08-25)

  39. None
  40. None
  41. Chatfield’s Six Rules for Data Analysis

    1. Do not attempt to analyze the data until you understand what is being measured and why.
    2. Find out how the data were collected.
    3. Look at the structure of the data.
    4. Carefully examine the data in an exploratory way, before attempting a more sophisticated analysis.
    5. Use your common sense at all times.
    6. Report the results in a clear, self-explanatory way.

    From Problem Solving: A Statistician’s Guide by Chris Chatfield, 2nd Edition, Chapman & Hall.
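    Rule 3 ("Look at the structure of the data") can start with something as simple as the sketch below, which counts rows and reports missing values, distinct values, and value types per column. It is not from Chatfield's book: the function name `describe_structure` and the example records are invented for illustration.

```python
def describe_structure(rows):
    """Summarize the structure of a list of dict records.

    Hypothetical helper: treats None and the empty string as missing.
    """
    columns = sorted({key for row in rows for key in row})
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None and v != ""]
        summary[col] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            # Mixed types in one column are an early warning sign.
            "types": sorted({type(v).__name__ for v in present}),
        }
    return {"n_rows": len(rows), "columns": summary}


# Made-up records: one missing value and one number stored as text.
records = [
    {"id": 1, "sbp": 120},
    {"id": 2, "sbp": None},
    {"id": 3, "sbp": "135"},
]
print(describe_structure(records))
```

    A summary like this surfaces problems such as a measurement stored as text before any modeling begins, which is the point of rules 3 and 4.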
  42. https://www.boredpanda.com/world-war-2-aircraft-survivorship-bias-abraham-wald/

  43. Howard Wainer, Visual Revelations

    Diagram of all of the places where the planes were damaged the most
  44. https://www.financialgazette.co.zw/survivorship-bias-how-data-driven-decisions-can-be-wrong/ww2-survivorship-bias-560x333/

  45. You have Ten Minutes?

    • No time for subtlety.
    • Round, a lot.
    • Edit, ruthlessly.
      – One pass through software (“default options”) is never enough.
      – Better for people to leave the table hungry than stuffed.
    • Have something to say, and say it clearly.
    • Stay away from the pie.
  46. Statistics is too important to be left to statisticians.

    Thomas.Love@case.edu
    https://speakerdeck.com/thomaselove/presenting-effectively-with-data-in-a-hurry
    See also: Karl Broman’s “Creating Effective Figures and Tables” (slides) at tinyurl.com/graphs2017
  47. On being “approximately right rather than exactly wrong” John Tukey

  48. Source: Hermann Brenner, "Long-term survival rates of cancer patients achieved by the end of the 20th century: a period analysis," The Lancet, 360 (October 12, 2002), 1131-1135.

    edwardtufte.com
  49. edwardtufte.com

  50. Slopegraphs! edwardtufte.com

  51. edwardtufte.com

  52. In addition to slopegraphs, consider sparklines: intense, simple, word-sized graphics

    The most common data display is a noun accompanied by a number. For example, a medical patient's current level of glucose is reported in a clinical record as a word and number:

    edwardtufte.com
  53. sparklines: intense, simple, word-sized graphics

    Placed in the relevant context, a single number gains meaning. Thus, the most recent measurement of glucose should be compared with earlier measurements for the patient. This data-line shows the path of the last 80 readings of glucose:

    edwardtufte.com
  54. sparklines: intense, simple, word-sized graphics

    Lacking a scale of measurement, this free-floating line is de-quantified. At least we do know the value of the line’s right-most data point, which corresponds to the most recent value of glucose, the number recorded at far right. Both representations of the most recent reading are tied together with a color accent:

    edwardtufte.com
  55. sparklines: intense, simple, word-sized graphics

    Some useful context is provided by showing the normal range of glucose, here as a gray band. Compared to normal limits, readings above the band horizon are elevated, those below reduced:

    edwardtufte.com
  56. sparklines: intense, simple, word-sized graphics

    For clinical analysis, the task is to detect quickly and assess wayward deviations from normal limits, shown here by visual deviations outside the gray band. Multiplying this format brings in additional data from the medical record; a stack, which can show hundreds of variables and thousands of measurements, allows fast effective parallel comparisons:

    edwardtufte.com
  57. sparklines: intense, simple, word-sized graphics

    These little data lines, because of their active quality over time, are named sparklines: small, high-resolution graphics usually embedded in a full context of words, numbers, images. Sparklines are datawords: data-intense, design-simple, word-sized graphics.

    edwardtufte.com
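    The sparkline idea described in these slides (a word, a word-sized line, the latest number, and a normal range for context) can be approximated in plain text with Unicode block characters. This is a rough sketch, not Tufte's design: the function names, the glucose readings, and the `normal=(70, 140)` range are placeholders chosen for illustration and are not clinical guidance.

```python
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map a sequence of numbers onto eight block heights, scaled min to max."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series: draw every point at the lowest height.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    return "".join(BARS[int((v - lo) * top / span)] for v in values)


def labeled_sparkline(label, readings, normal=(70, 140)):
    """Noun, sparkline, latest value, and where it sits versus a normal range.

    The default range is an illustrative placeholder, not a clinical reference.
    """
    latest = readings[-1]
    if latest > normal[1]:
        status = "high"
    elif latest < normal[0]:
        status = "low"
    else:
        status = "normal"
    return f"{label} {sparkline(readings)} {latest} ({status})"


# Made-up readings, purely for illustration.
print(labeled_sparkline("glucose", [100, 120, 150]))
```

    Stacking several such lines, one per variable, gives a text-only version of the parallel comparison the slides describe.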