BMMB554: Talks, Papers, Grants

What is the reality of Academia?

Anton Nekrutenko

April 12, 2017

Transcript

  1. Talks
    ‣ Most talks are boring
    ‣ Most slides are horrible
    ‣ Talks are the best medium to get your point across
    ‣ https://youtu.be/WAwDvbIfkos
    ‣ http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.0030077
  2. Papers
    ‣ Authorship matters
    ‣ Make things clear
    ‣ Read more non-scientific literature
    ‣ http://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004205
  3. Data visualization
    ❖ A picture is worth a thousand words
    ❖ A critical part of being a successful scientist
    ❖ Should be done at all stages of the scientific process
      ❖ Data exploration
      ❖ Data analysis
      ❖ Final presentation
    ❖ Do NOT be the guy/girl who makes plots like this for a paper or ends up at http://wtfviz.net
  4. Do NOT abuse 3D
    Roeder K (1994) DNA fingerprinting: A review of the controversy (with discussion). Statistical Science 9:222-278, Figure 4
    http://www.biostat.wisc.edu/~kbroman/topten_worstgraphs/
  5. The beautiful empty space
    Wittke-Thompson JK, Pluzhnikov A, Cox NJ (2005) Rational inferences about departures from Hardy-Weinberg equilibrium. American Journal of Human Genetics 76:967-986, Figure 1
    http://www.biostat.wisc.edu/~kbroman/topten_worstgraphs/
  6. What a beautiful straight line
    Epstein MP, Satten GA (2003) Inference on haplotype effects in case-control studies using unphased genotype data. American Journal of Human Genetics 73:1316-1329, Figure 1
    http://www.biostat.wisc.edu/~kbroman/topten_worstgraphs/
  7. How many data points is that?
    Hummer BT, Li XL, Hassel BA (2001) Role for p53 in gene induction by double-stranded RNA. J Virol 75:7774-7777, Figure 4
    http://www.biostat.wisc.edu/~kbroman/topten_worstgraphs/
  8. Pie charts are evil
    Cawley S, et al. (2004) Unbiased mapping of transcription factor binding sites along human chromosomes 21 and 22 points to widespread regulation of noncoding RNAs. Cell 116:499-509, Figure 1
    http://www.biostat.wisc.edu/~kbroman/topten_worstgraphs/
  9. Odd choice of graph type
    Kim OY, et al. (2012) Higher levels of serum triglyceride and dietary carbohydrate intake are associated with smaller LDL particle size in healthy Korean women. Nutrition Research and Practice 6:120-125, Figure 1
    http://www.biostat.wisc.edu/~kbroman/topten_worstgraphs/
  10. So much wasted ink
    Jorgenson E, et al. (2005) Ethnicity and human genetic linkage maps. American Journal of Human Genetics 76:276-290, Figure 2
    http://www.biostat.wisc.edu/~kbroman/topten_worstgraphs/
  11. Kids blocks?
    Cotter DJ, et al. (2004) Hematocrit was not validated as a surrogate endpoint for survival among epoetin-treated hemodialysis patients. Journal of Clinical Epidemiology 57:1086-1095, Figure 2
    http://www.biostat.wisc.edu/~kbroman/topten_worstgraphs/
  12. 3:1 is better than 1:3?
    Broman KW, Murray JC, Sheffield VC, White RL, Weber JL (1998) Comprehensive human genetic maps: Individual and sex-specific variation in recombination. American Journal of Human Genetics 63:861-869, Figure 1
    http://www.biostat.wisc.edu/~kbroman/topten_worstgraphs/
  13. Judicious use of space for legends
    ❖ In-chart legends often waste space
    http://vis4.net/blog/posts/doing-the-line-charts-right/
  14. Aspect ratio
    ❖ “In his text Visualizing Data, William Cleveland demonstrates how the aspect ratio of a line chart can affect an analyst's perception of trends in the data. Cleveland proposes an optimization technique for computing the aspect ratio such that the average absolute orientation of line segments in the chart is equal to 45 degrees. This technique, called banking to 45 degrees, is designed to maximize the discriminability of the orientations of the line segments in the chart.”
    http://vis.berkeley.edu/papers/banking/
    Figure: Two plots of monthly atmospheric carbon dioxide measurements, taken from 1959 to 1990. The first plot, with an aspect ratio of 1.17, reveals an accelerating increase in CO2 levels. The second plot, with an aspect ratio of 7.87, facilitates closer inspection of seasonal fluctuations, revealing a gradual attack followed by a steeper decay. These aspect ratios were automatically determined using multi-scale banking.
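The banking idea quoted on this slide can be sketched in a few lines. This is a minimal illustration, not the multi-scale method used for the figure: it assumes the unweighted "average absolute orientation" variant, normalizes both axes by their data ranges, and uses bisection to find the height/width ratio at which the mean absolute segment angle is 45 degrees. The function name `bank_to_45` and its parameters are hypothetical.

```python
import numpy as np


def bank_to_45(x, y, lo=1e-6, hi=1e6, iters=200):
    """Return the plot height/width ratio at which the mean absolute
    orientation of the line segments equals 45 degrees.

    Assumptions (not from the slide): x is strictly increasing, y is not
    constant, and all segments are weighted equally.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.any(np.diff(x) <= 0):
        raise ValueError("x must be strictly increasing")
    if y.max() == y.min():
        raise ValueError("y must not be constant")

    # Segment slopes after scaling both axes to the unit interval.
    dx = np.diff(x) / (x.max() - x.min())
    dy = np.diff(y) / (y.max() - y.min())
    slopes = np.abs(dy / dx)

    def mean_orientation(ratio):
        # On a plot with the given height/width ratio, each slope is
        # stretched by that ratio before it is seen as an angle.
        return np.mean(np.arctan(slopes * ratio))

    target = np.pi / 4
    if not (mean_orientation(lo) < target < mean_orientation(hi)):
        raise ValueError("no ratio in the search range reaches 45 degrees")

    # mean_orientation grows monotonically with ratio, so bisect in log space.
    for _ in range(iters):
        mid = np.sqrt(lo * hi)
        if mean_orientation(mid) < target:
            lo = mid
        else:
            hi = mid
    return float(np.sqrt(lo * hi))
```

A steep zigzag yields a small ratio (a wide, short plot), while a smooth trend yields a ratio closer to 1. The result could be passed to a plotting library as the figure's height divided by its width.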
  15. Do not exceed resolution of visual acuity
    ❖ Human eye acuity is ~50 cycles/degree or about 1/200 (0.3 pt) at 10 inches
    Krzywinski, M., Brol, I., Jones, S., & Marra, M. (2012). Getting into visualization of large biological data sets: 20 imperatives of information design. Poster presented at 2nd IEEE Symposium on Biological Data Visualization (BioVis 2012), Seattle, WA.
  16. Do not exceed resolution of visual acuity
    ❖ Human eye acuity is ~50 cycles/degree or about 1/200 (0.3 pt) at 10 inches
    Krzywinski, M., Brol, I., Jones, S., & Marra, M. (2012). Getting into visualization of large biological data sets: 20 imperatives of information design. Poster presented at 2nd IEEE Symposium on Biological Data Visualization (BioVis 2012), Seattle, WA.
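As a rough back-of-the-envelope sketch of what this limit means for a figure: if the slide's "1/200" is read as 1/200 of an inch (an assumption, since the unit is not stated), a printed figure viewed at 10 inches can show at most about 200 distinguishable marks per inch of width. The function name and the default are hypothetical.

```python
def max_resolvable_marks(width_in, marks_per_inch=200):
    """Upper bound on side-by-side marks a reader can tell apart.

    marks_per_inch=200 assumes the slide's '1/200' means 1/200 inch
    at a viewing distance of 10 inches.
    """
    return int(width_in * marks_per_inch)


# A 7-inch-wide figure cannot usefully show more than this many columns.
print(max_resolvable_marks(7))
```

Anything denser than this bound is a candidate for downsampling or aggregation before plotting.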
  17. Show variation with statistics
    Krzywinski, M., Brol, I., Jones, S., & Marra, M. (2012). Getting into visualization of large biological data sets: 20 imperatives of information design. Poster presented at 2nd IEEE Symposium on Biological Data Visualization (BioVis 2012), Seattle, WA.
    Figure: Approaches to encoding min/avg/max values of downsampled data. In the top hi-low trace, the vertical bars are perceived as a separate layer and effectively show variance without obscuring trends in the average.
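One way to produce the min/avg/max summary described in this caption is to split the series into consecutive chunks and summarize each chunk. The sketch below assumes equal-sized chunks in input order; the function name `downsample_min_avg_max` is hypothetical and not taken from the poster.

```python
import numpy as np


def downsample_min_avg_max(values, n_bins):
    """Summarize a long series as (min, mean, max) per consecutive chunk."""
    values = np.asarray(values, dtype=float)
    if not 1 <= n_bins <= len(values):
        raise ValueError("n_bins must be between 1 and len(values)")
    # array_split tolerates lengths that are not a multiple of n_bins.
    chunks = np.array_split(values, n_bins)
    return [(float(c.min()), float(c.mean()), float(c.max())) for c in chunks]


summary = downsample_min_avg_max([1, 2, 3, 4, 5, 6], n_bins=2)
print(summary)  # [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
```

The mean can then be drawn as a line and each (min, max) pair as a thin vertical bar, which is one reading of the hi-low trace the caption describes.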
  18. Use non-linear scales when needed
    Krzywinski, M., Brol, I., Jones, S., & Marra, M. (2012). Getting into visualization of large biological data sets: 20 imperatives of information design. Poster presented at 2nd IEEE Symposium on Biological Data Visualization (BioVis 2012), Seattle, WA.
    When drawing the position and size of densely packed genes, encode the gene’s size using a non-linear mapping. When the number of data values is large, such as in the OMIM gene track, hollow glyphs are effective. For even greater number of points, a density map is preferred.
    Figure: chr 1 (axis 50-200 Mb) with CANCER CENSUS, SNP, and OMIM tracks; gene size (kb) classes <10, 10-30, 30-50, 50-100, 100-200, >200; labeled variants RAD54L G>A rs121908690 and RNASEL C>T rs74315365.
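The size classes in the figure legend (<10, 10-30, 30-50, 50-100, 100-200, >200 kb) amount to a non-linear, stepwise mapping from gene size to glyph class. A minimal sketch of such a mapping follows; the constant and function names are hypothetical, and placing a boundary value such as exactly 10 kb in the upper class is an assumption the slide does not settle.

```python
from bisect import bisect_right

# Class boundaries and labels as read from the figure legend (kb).
SIZE_EDGES_KB = [10, 30, 50, 100, 200]
SIZE_LABELS = ["<10", "10-30", "30-50", "50-100", "100-200", ">200"]


def size_class(size_kb):
    """Return (class index, legend label) for a gene size in kb.

    Boundary values fall into the upper class (an assumption).
    """
    idx = bisect_right(SIZE_EDGES_KB, size_kb)
    return idx, SIZE_LABELS[idx]


print(size_class(5))    # (0, '<10')
print(size_class(75))   # (3, '50-100')
print(size_class(250))  # (5, '>200')
```

The class index can drive glyph width, so that a 5 kb gene and a 250 kb gene differ by a few discrete steps rather than by a factor of fifty.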
  19. Aggregate data
    Krzywinski, M., Brol, I., Jones, S., & Marra, M. (2012). Getting into visualization of large biological data sets: 20 imperatives of information design. Poster presented at 2nd IEEE Symposium on Biological Data Visualization (BioVis 2012), Seattle, WA.
    Figure: [...]ed theme. What is communicated? (A) The raw data imparts no clear message. (B) Binning indicates ranges, not individual values, are important. (C) Frequency distribution suggests that there is a shortage of medium-sized values. (D) Individual data points can be removed to emphasize trend and significance.
    Figure data: raw values 12 54 82 29 25 22 67 61 23 79; bins 0-30, 31-60, 61-100.
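The binning step in panels (B) and (C) can be reproduced directly from the ten raw values and the three ranges shown on the slide. This is a small sketch; the names `VALUES`, `BINS`, and `bin_counts` are hypothetical.

```python
# Raw values and inclusive bin ranges as they appear on the slide.
VALUES = [12, 54, 82, 29, 25, 22, 67, 61, 23, 79]
BINS = [(0, 30), (31, 60), (61, 100)]


def bin_counts(values, bins):
    """Count how many values fall into each inclusive (low, high) range."""
    counts = {f"{low}-{high}": 0 for low, high in bins}
    for v in values:
        for low, high in bins:
            if low <= v <= high:
                counts[f"{low}-{high}"] += 1
                break
    return counts


print(bin_counts(VALUES, BINS))  # {'0-30': 5, '31-60': 1, '61-100': 4}
```

The single value in the 31-60 range is the "shortage of medium-sized values" that the raw list in panel (A) does not make obvious.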