Why your field needs a hack week

The skills that broadly fall under the umbrella of data science are becoming increasingly important to modern-day science. As demands upon researchers to apply and implement new methodologies and data analysis techniques rise, training scientists in these methods is becoming more of a priority.

The week of this talk, BIDS is hosting Astro Hack Week, a five-day mix of summer school, unconference, and hackathon. It is one approach to improving data literacy in astronomy while at the same time providing a collaborative venue for researchers to share ideas and their approaches to problems, work on research projects, and learn new concepts that go beyond those generally taught in astronomy. In this talk, I will give an overview of the ideas that underlie Astro Hack Week, highlight some results from this week’s workshop, and try to convince you that your field needs a hack week too.

Daniela Huppenkothen

October 17, 2016

Transcript

  1. Why your field needs a _____ hack week
     (astro, bio, politics, chemistry, social, geo, physics)
     Daniela Huppenkothen
     NYU Center for Data Science
     NYU Center for Cosmology and Particle Physics
     Tiana_Athriel · dhuppenkothen
  2. #AstroHackWeek @ BIDS
     • 5 days
     • 50 researchers
     • all of astronomy represented
     • tutorials, lectures, hacking
     • 28 projects in progress
  3. Data is exploding
     [Chart: Growth in the Scientific Data Holdings of IRSA, Projected to 2014. IRSA holdings (TB) by year, 2008 to 2014, for IRSA general, Spitzer, and WISE. Chart courtesy of IRSA.]
  4. Objectives
     • teach new data science methods
     • increase connectedness
     • foster exchange between disciplines
     • build networks
     • promote open science
  5. “hack”
     A hack is a small project with a very clear goal, which should be completed by the end of the time initially allocated to it.
  6. We want a diverse set of participants from different backgrounds and all academic subdisciplines of astronomy, and also a wide range of technical backgrounds from beginners to experts in any of the topics we are interested in teaching and working on, and also a wide range of academic seniority.
  7. ???

  8. “What did you learn at Astro Hack Week?”
     “Machine learning (and where the resources are if I want to learn more)”
     “Not to be too afraid of Bayesian methods”
     “The evening "debate" about corrected an important misunderstanding I had about model selection”
  9. “What did you learn at Astro Hack Week?”
     “[…] learned about new tools to use, new methods to try, and a lot of good practices”
     “team coding”
     “I was really able to improve my programming skills and habits after this workshop”
     “I learned a lot about profiling and commenting code, and methods for improving performance”
  10. “What did you learn at Astro Hack Week?”
      “The main thing I took from AstroHackWeek is the fact that I'm not alone […]”
      “Working with Brewer on transdimensional sampling was extremely helpful for our current work.”
  11. Uncovering planets and stellar activity using radial velocities
      Faria, Haywood, Brewer et al., 2016, A&A 588, A31
      Fig. 1. Representation of the relations between parameters and observations in our RV model, as a probabilistic graphical model. An arrow between two nodes indicates the direction of conditional dependence. The circled nodes are the parameters of the model, whose joint distribution is sampled by the Markov chain Monte Carlo (MCMC) algorithm. The double circled node v_i represents the observed RVs. The filled nodes represent deterministic variables: if these variables have parent nodes (v^kep_i and Σ), they are given by a deterministic function of […]
      Fig. 2. Posterior distribution for the number of planets N_p. The counts are number of posterior samples in models with a given number of planets. The two ratios of probabilities between models with 1, 2, and 3 planets are highlighted; note that p(0) = p(1) = 0. [Plot: number of posterior samples (0 to 3000) against number of planets (0 to 10); annotated ratios p(2)/p(1) and p(3)/p(2), the latter labeled 4.49.]
  12. Things we don’t know
      • How can we measure success?
      • What is the ideal set of participants for Astro Hack Week?
      • What is the right balance between learning and project work?
      • How can we effectively mitigate impostor syndrome?
  13. Should you run a hack week?
      • we have more data than we know what to do with
      • our data is complex
      • we need better/new methods
      • collaboration and networking are important to us
      • we want to make open science a priority
  14. Resources
      • http://www.astrohackweek.org
      • https://github.com/AstroHackWeek
      • https://hackpad.com/Astro-Hack-Week-2016-Hack-Index-PlmbAyVqqtQ
      • https://danielahuppenkothen.wordpress.com
      • https://neurohackweek.github.io
      • https://geohackweek.github.io
      • http://github.com/dhuppenkothen/entrofy
      • https://medium.com/@dalcashdvinsky/the-horror-of-hack-days-52c6b52cfc3b#.biklgcpd3
      • Come and talk to us!
  15. Questions?
      (special thanks to: Kyle Barbary, Phil Marshall, David W. Hogg, Jake Vanderplas, Brian McFee)