Scientific Visualization

Eitan Lees
November 07, 2018

A talk I gave in the Department of Scientific Computing at Florida State University covering scientific visualization.


Transcript

  1. Scientific Visualization By Eitan Lees

  2. Scientific Visualization* By Eitan Lees (* Mostly in Python)

  3. Scientific Visualization* By Eitan Lees (* Mostly in Python) Data

  4. Scientific Visualization* By Eitan Lees (* Mostly in Python) Data Plots
  5. By Eitan Lees Data Plots

  6. Scientific Visualization By Eitan Lees

  7. None
  8. Acknowledgements
    - Sachin Shanbhag for letting me explore
    - Nathan Crock for putting it all together
    - Much of today’s talk comes from Jake VanderPlas’ PyCon 2017 talk
    - Brian Granger and the Jupyter project
  9. Show and Tell

  10. So you want to make a plot?

  11. - Visit - Mongo - Vesta - Gnuplot - TikZ - MATLAB - Matplotlib - Adobe Illustrator
  12. None
  13. Government Solutions

  14. None
  15. None
  16. Advantages:
    - All-in-one platform
    - Developed with scientific applications in mind
    - Very popular for physical modeling
    Weaknesses:
    - Too much for simple plots
    - GUI-forward
    - Stability issues
  17. Commercial Solutions

  18. Data Visualization

  19. None
  20. Python Visualization Landscape

  21. None
  22. Strengths:
    - Designed like MATLAB: switching was easy

  23. Strengths:
    - Designed like MATLAB: switching was easy
    - Many rendering backends

  24. Strengths:
    - Designed like MATLAB: switching was easy
    - Many rendering backends
    - Can reproduce just about any plot (with a bit of effort)

  25. Strengths:
    - Designed like MATLAB: switching was easy
    - Many rendering backends
    - Can reproduce just about any plot (with a bit of effort)
    - Well-tested, standard tool for over a decade
  26. None
  27. Strengths:
    - Designed like MATLAB: switching was easy
    - Many rendering backends
    - Can reproduce just about any plot (with a bit of effort)
    - Well-tested, standard tool for over a decade
    Weaknesses:
    - API is imperative & often overly verbose
    - Poor support for web/interactive graphics
    - Often slow for large & complicated data
  28. Objective: Improve on the weaknesses of matplotlib (without sacrificing the strengths!)

  29. Building on Matplotlib. Common idea: keep matplotlib as a versatile, well-tested backend, and provide a new domain-specific API.
  30. Key Features:
    - Provides the DataFrame object
    - Also provides a simple API for plotting

  31. Key Features:
    - Provides the DataFrame object
    - Also provides a simple API for plotting
    - More sophisticated statistical visualization tools have been added recently

  32. Key Features:
    - Like pandas, wraps matplotlib
    - Nice set of color palettes & plot styles
    - Focus on statistical visualization & modeling
  33. None
  34. None
  35. None
  36. But what about the internet?!?

  37. None
  38. None
  39. Common idea: build a new API that produces a plot serialization (often JSON) that can be displayed in the browser (often in Jupyter notebooks)
  40. Bokeh

  41. None
  42. Bokeh Advantages:
    - Web view/interactivity
    - Handles large data and/or streaming datasets
    - Geographical visualizations
    - Fully open source
    Weaknesses:
    - Plotly has some paid features
    - Limited output formats
    - Smaller (but growing) community
  43. None
  44. None
  45. The elephant in the room

  46. None
  47. D3 is everywhere

  48. But working with D3 can be challenging ...

  49. “Simple” Barchart

  50. D3 is a JavaScript package that streamlines the manipulation of objects on a webpage

  51. Vega is a detailed declarative specification for visualizations, built on D3
  52. Vega-Lite is a simpler declarative specification aimed at statistical visualization

  53. Altair is a Python API for creating Vega-Lite specifications
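
To make "a Vega-Lite specification" concrete, here is a minimal sketch of one written by hand as a plain Python dictionary and serialized to JSON. It does not use Altair; it only illustrates the kind of document a Python API could generate. The data values, the field names, and the schema version in the `$schema` URL are assumptions made for this example.

```python
import json

# Illustrative records (made up for this example), one dict per sample.
records = [
    {"petalLength": 1.4, "sepalLength": 5.1, "species": "setosa"},
    {"petalLength": 4.7, "sepalLength": 7.0, "species": "versicolor"},
    {"petalLength": 6.0, "sepalLength": 6.3, "species": "virginica"},
]

# A hand-written Vega-Lite style specification as a plain dict.
# The schema version below is an assumption, not something the talk states.
spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": records},
    "mark": "point",
    "encoding": {
        "x": {"field": "petalLength", "type": "quantitative"},
        "y": {"field": "sepalLength", "type": "quantitative"},
        "color": {"field": "species", "type": "nominal"},
    },
}

# The JSON text is what a browser-side renderer would consume.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The specification says which column maps to which visual channel and nothing about how to draw; that part is left to whatever renders the JSON.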

  54. None
  55. Bridging the gap

  56. None
  57. - Aggregates data and sends pixels
    - Can handle interactive visualization of billions of rows
    - Datasets themselves stored in objects that automatically produce intelligent visualizations
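
The "aggregates data and sends pixels" idea can be sketched in a few lines of plain Python: bin every point into a fixed grid and ship only the per-cell counts, so the payload size depends on the image resolution rather than the number of rows. This is a toy illustration, not the API of the library on the slide; the function name `aggregate_to_pixels` and its parameters are hypothetical.

```python
import random

def aggregate_to_pixels(points, width, height, x_range, y_range):
    """Count how many points fall into each cell of a width x height grid."""
    x_min, x_max = x_range
    y_min, y_max = y_range
    grid = [[0] * width for _ in range(height)]
    for x, y in points:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # ignore points outside the viewport
        # Map data coordinates to pixel indices; clamp the upper edge
        # so that x == x_max lands in the last column instead of overflowing.
        col = min(int((x - x_min) / (x_max - x_min) * width), width - 1)
        row = min(int((y - y_min) / (y_max - y_min) * height), height - 1)
        grid[row][col] += 1
    return grid

random.seed(0)
points = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100_000)]
grid = aggregate_to_pixels(points, width=40, height=20,
                           x_range=(-3, 3), y_range=(-3, 3))

# Only width * height counts need to reach the display, however many rows exist.
print(len(points), "points reduced to", len(grid) * len(grid[0]), "cell counts")
```

A real implementation would vectorize the binning and map counts to colors, but the reduction step is the same.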
  58. Don’t care about the internet?

  59. Want more power?

  60. None
  61. None
  62. None
  63. The Python Visualization Landscape

  64. Toward Declarative Visualization

  65. What do you mean by declarative visualization?

  66. Example: Statistical Data
    Tidy data: i.e. rows are samples, columns are features

  67. Example: Statistical Data
    Tidy data: i.e. rows are samples, columns are features
    “I want to scatter petal length vs. sepal length, and color by species”
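
As a small sketch of what tidy data looks like in code, the snippet below parses a few iris-style rows where each row is one sample and each column is one feature. The three rows and the column names are illustrative assumptions, and only the standard library is used. Note that the request on the slide can be phrased entirely in terms of column names.

```python
import csv
import io

# A few iris-style measurements in tidy form (illustrative rows):
# one row per sample, one column per feature.
raw = """sepalLength,sepalWidth,petalLength,petalWidth,species
5.1,3.5,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""
rows = list(csv.DictReader(io.StringIO(raw)))

# "Scatter petal length vs. sepal length, and color by species"
# only needs three column names, not array positions.
x = [float(r["petalLength"]) for r in rows]
y = [float(r["sepalLength"]) for r in rows]
color_by = [r["species"] for r in rows]

for xi, yi, ci in zip(x, y, color_by):
    print(xi, yi, ci)
```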
  68. None
  69. None
  70. Problem: We’re mixing the what with the how

  71. Toward a well-motivated Declarative Visualization
    Imperative
    - Specify how something should be done.
    - Specification & execution intertwined.
    - “Put a red circle here and a blue circle here”
    Declarative
    - Specify what should be done.
    - Separates specification from execution.
    - “Map <x> to a position, and <y> to a color”
    Declarative visualization lets you think about the data and relationships, rather than incidental details.
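
The contrast can be sketched with a toy example in plain Python. The imperative function decides, step by step, how each circle gets its color; the declarative version only states which field maps to which channel and leaves execution to a separate renderer. Everything here (`draw_imperative`, `render`, `PALETTE`, the tuple-based "draw commands") is hypothetical and exists only for illustration; it is not any plotting library's API.

```python
points = [
    {"x": 1.0, "y": 2.0, "group": "a"},
    {"x": 2.0, "y": 1.5, "group": "b"},
    {"x": 3.0, "y": 3.5, "group": "a"},
]
PALETTE = ["red", "blue", "green"]

def draw_imperative(points):
    """Imperative: the caller spells out how every circle is produced."""
    commands = []
    colors = {}
    for p in points:
        # Bookkeeping the caller must do by hand: pick a color per group.
        if p["group"] not in colors:
            colors[p["group"]] = PALETTE[len(colors) % len(PALETTE)]
        commands.append(("circle", p["x"], p["y"], colors[p["group"]]))
    return commands

# Declarative: state what maps to what, nothing about how to draw it.
spec = {"mark": "circle", "encoding": {"x": "x", "y": "y", "color": "group"}}

def render(spec, points):
    """A toy renderer: it, not the caller, decides how the spec is executed."""
    enc = spec["encoding"]
    categories = sorted({p[enc["color"]] for p in points})
    colors = {c: PALETTE[i % len(PALETTE)] for i, c in enumerate(categories)}
    return [
        (spec["mark"], p[enc["x"]], p[enc["y"]], colors[p[enc["color"]]])
        for p in points
    ]

print(draw_imperative(points))
print(render(spec, points))
```

Changing the plot in the declarative version means editing the `spec` dictionary, while the imperative version requires rewriting the loop.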
  73. Altair: Declarative visualization in Python. Based on the Vega and Vega-Lite grammars. https://altair-viz.github.io/
  74. Altair for Statistical Visualization

  75. Encodings are Flexible:

  76. Altair is Interactive

  77. ...

  78. ... Altair Tutorial THIS Friday in THIS room from 3:30-4:30 pm