 ctosummitdatasciencekeynote-150516193652-lva1-app6891.pdf

This 2015 keynote at the CTO Summit on Data Science emphasizes that the key to effective data science is identifying and framing the right problem. It discusses the importance of explainability in models and the iterative nature of building machine learning models. It highlights the need for optimizing the speed of learning and conducting experiments wisely in order to achieve better results.

Daniel Tunkelang

May 20, 2026

Transcript

  1. tl;dr The most important part of data science is picking the right problem and figuring out how to frame it.
  2. But nobody knows everything.*
    Class HashMap<K,V>
    java.lang.Object > java.util.AbstractMap<K,V> > java.util.HashMap<K,V>
    Type Parameters: K - the type of keys maintained by this map; V - the type of mapped values
    All Implemented Interfaces: Serializable, Cloneable, Map<K,V>
    *Except Jeff Dean.
  3. Data science is a mindset.
    • Explain: Iterate using explainable models.
    • Express: Model your utility and inputs.
    • Experiment: Optimize for speed of learning.
  4. The importance of being explainable.
    • Algorithms can protect you from overfitting, but they can’t protect you from the biases you introduce.
    • Introspection into your models and features makes it easier for you and others to debug them.
    • This matters especially if you don’t completely trust your objective function or the representativeness of your training data.
  5. Linear models? Decision trees?
    • Linear regression and decision trees favor explainability over accuracy, compared to more sophisticated models.
    • But size matters. If you have too many features or too deep a decision tree, you lose explainability.
    • You can always upgrade to a more sophisticated model when you trust your objective function and training data.
    • Building a machine learning model is an iterative process. Optimize for the speed of your own learning.
  6. How to find your prince.
    You have to kiss a lot of frogs to find one prince. So how can you find your prince faster? By finding more frogs and kissing them faster and faster. -- Mike Moran
  7. Think like an economist.
    • Yesterday: Experiments are expensive, choose hypotheses wisely.
    • Today: Experiments are cheap, do as many as you can!
  8. Test one variable at a time.
    • Autocomplete
    • Entity Tagging
    • Vertical Intent
    • # of Suggestions
    • Suggestion Order
    • Language
    • Query Construction
    • Ranking Model
  9. tl;dr The most important part of data science is picking the right problem and figuring out how to frame it.
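
Slides 4 and 5 argue that a small linear model is easy to introspect. As a rough illustration only, the sketch below breaks one prediction of a linear model into per-feature contributions so the largest driver is visible at a glance. The feature names, weights, and bias are hypothetical values chosen for the example; they do not come from the talk.

```python
def explain(weights, bias, features):
    """Return the linear score and per-feature contributions, largest first."""
    # Each feature contributes weight * value to the final score.
    contributions = {name: weights[name] * value for name, value in features.items()}
    score = bias + sum(contributions.values())
    # Rank by absolute contribution so the dominant features come first.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return score, ranked


# Hypothetical weights and inputs, for illustration only.
weights = {"title_match": 2.0, "recency": 0.5, "click_rate": 1.5}
bias = 0.125
features = {"title_match": 1.0, "recency": 0.25, "click_rate": 0.5}

score, ranked = explain(weights, bias, features)
print(f"score = {score}")
for name, contribution in ranked:
    print(f"{name}: {contribution:+.3f}")
```

With only three features the ranking fits on a few lines. With hundreds of features the same breakdown becomes hard to read, which is the "size matters" caveat on slide 5.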
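
Slide 8's advice to test one variable at a time can be sketched as generating experiment variants that each differ from a baseline in exactly one factor. The factor names loosely follow the slide, but the baseline values and the alternatives are hypothetical, and `one_at_a_time` is an illustrative helper rather than anything shown in the talk.

```python
def one_at_a_time(baseline, alternatives):
    """Build variants that each change exactly one factor of the baseline."""
    variants = []
    for factor, values in alternatives.items():
        for value in values:
            if value == baseline[factor]:
                continue  # identical to the baseline, nothing to test
            variant = dict(baseline)  # copy so the baseline stays untouched
            variant[factor] = value
            variants.append((factor, variant))
    return variants


# Hypothetical baseline configuration and alternatives, for illustration only.
baseline = {"autocomplete": True, "num_suggestions": 5, "ranking_model": "v1"}
alternatives = {
    "autocomplete": [False],
    "num_suggestions": [3, 10],
    "ranking_model": ["v2"],
}

variants = one_at_a_time(baseline, alternatives)
for factor, variant in variants:
    print(f"change {factor}: {variant}")
```

Because every variant differs from the baseline in a single factor, any measured difference can be attributed to that factor, at the cost of not observing interactions between factors.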