Seven Sins of Data Science Newbie
_themessier
March 10, 2018
Presented at WiDS Mumbai 2018
Transcript
Seven Sins of a Data Science Newbie (and how not to commit them)
Sarah Masud, Red Hat
About Me
GitHub: sara-02
Blog: themessier.wordpress.com
Me Learning To Give Back:
1. Open Source Contributions
2. Blogs
3. Meetups, Conferences
4. Mentorship
5. Program review committees
Let’s begin ;)
Image: https://commons.wikimedia.org/wiki/File:DataScienceLogo.png
Image: https://chroniclesofanassistant.wordpress.com/2010/11/14/first-day-of-work/
Image: https://www.kdnuggets.com/2016/10/big-data-science-expectation-reality.html
1: The Problem Statement
At College: “On a loan dataset, using logistic regression, determine if a person will default or not.”
1: The Problem Statement
At Work: “We have been collecting these data points for the past 3 years. See what can be done to monetize them.”
1: The Problem Statement
Solution:
1. Understand the business needs!
2. Then understand the data collected.
3. Finally, translate the vague problem into a known one.
2: Show Me the Data
At College: “Use the data from Kaggle, UCLA registry, Image-Net, Wikipedia...”
Image: https://me.me/i/show-me-the-data-9747283
2: Show Me the Data
At Work: “Use whatever data is legally available, but get this problem solved!”
2: Show Me the Data
Solution:
1. Don’t expect someone to give you the data willingly!
2. Learn to deal with a lack of labelled data.
3. Learn web scraping and data ingestion pipelines.
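As a small illustration of the web scraping point, the sketch below pulls download links out of an HTML page using only the Python standard library. The sample page, the file paths, and the names `LinkCollector` and `extract_links` are invented for this example; a real pipeline would also need to fetch pages, respect the site's terms, and handle malformed markup.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag that has one."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links

# Hypothetical page listing yearly data files (not a real site).
SAMPLE_PAGE = (
    '<ul>'
    '<li><a href="/reports/2017.csv">2017</a></li>'
    '<li><a href="/reports/2018.csv">2018</a></li>'
    '<li><a>no link</a></li>'
    '</ul>'
)

if __name__ == "__main__":
    for link in extract_links(SAMPLE_PAGE):
        print(link)
```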
3: Using A Missile Gun To Kill The Chicken
At College: “Sounds cool! Let me use this SOTA algorithm.”
Image: https://pbs.twimg.com/media/B83v847CUAAQHKg.jpg:large
3: Using A Missile Gun To Kill The Chicken
At Work: “Provide us with a cheap, accurate, stable solution.”
Image: https://www.someecards.com/usercards/viewcard/if-you-torture-the-data-they-will-confess-94dd7
3: Using A Missile Gun To Kill The Chicken
Solutions:
1. Not every problem needs to be a DS problem!
2. Use switch cases if that is enough.
3. Understand the business constraints.
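The "switch cases" advice can be made concrete with a rule-based baseline. The sketch below routes support tickets by keyword; the scenario, the keywords, the queue names, and the function names are all hypothetical. If a baseline like this already meets the business need, a trained model may not be worth its cost.

```python
# Hypothetical rules: (keywords, queue). First matching rule wins.
RULES = [
    (("refund", "charge", "invoice"), "billing"),
    (("password", "login", "locked"), "account"),
    (("crash", "error", "bug"), "technical"),
]

def route_ticket(text, default="general"):
    """Return the queue of the first rule with a keyword in the text."""
    lowered = text.lower()
    for keywords, queue in RULES:
        if any(word in lowered for word in keywords):
            return queue
    return default

def rule_accuracy(examples):
    """Fraction of (text, expected_queue) pairs the rules get right."""
    hits = sum(route_ticket(text) == label for text, label in examples)
    return hits / len(examples)

if __name__ == "__main__":
    print(route_ticket("I was charged twice and need a refund"))
    print(route_ticket("Just saying hello"))
```

Measuring `rule_accuracy` on a small labelled sample gives a number that any later model has to beat.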
4: The Value of Your Work
At College:
1. Accuracy of model.
2. Number of research papers.
3. Subject grade!
4: The Value of Your Work
At Work:
1. RoI.
2. RoI.
3. RoI.
Image: https://me.me/i/show-me-the-money-memes-11885126
4: The Value of Your Work
Solution:
1. Understand the business.
2. Optimise for Accuracy vs Cost.
3. Keep the end user in mind.
5: Serving the Model
At College: “It is about building the most accurate system and running it from the terminal. And that is it!”
5: Serving the Model
At Work:
1. How many concurrent users can we serve?
2. What time delay can we afford before we lose the customer?
5: Serving the Model
Industry:
1. How is the model exposed to the UI?
2. Can the model be distributed?
3. Can the model scale with an increase in data?
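One way to start answering the concurrency and delay questions is to measure them. The sketch below sends requests to a stand-in model from a pool of worker threads and reports latency percentiles and throughput. `fake_predict`, its 5 ms sleep, and the request counts are assumptions made for illustration; a real test would call the deployed service instead.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def fake_predict(features):
    """Stand-in for a real model call; the 5 ms sleep is an invented cost."""
    time.sleep(0.005)
    return sum(features) > 1.0

def timed_call(features):
    """Time a single prediction (service time only, not queue wait)."""
    start = time.perf_counter()
    fake_predict(features)
    return time.perf_counter() - start

def measure_latency(n_requests=40, concurrent_users=8):
    """Run n_requests through concurrent_users threads and summarise timings."""
    payloads = [[0.2, 0.9]] * n_requests
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        latencies = list(pool.map(timed_call, payloads))
    wall = time.perf_counter() - wall_start
    cuts = statistics.quantiles(latencies, n=100)  # 99 percentile cut points
    return {
        "count": len(latencies),
        "p50": cuts[49],
        "p95": cuts[94],
        "throughput_per_s": n_requests / wall,
    }

if __name__ == "__main__":
    print(measure_latency())
```

Repeating the run with larger `concurrent_users` values shows where latency starts to exceed the delay the business can tolerate.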
6: Know Thy Audience
At College: “Technical mentors, peers.”
6: Know Thy Audience
At Work: “The audience is always a mixed bag.”
6: Know Thy Audience
Solution:
1. Know your concepts well.
2. Have “teaching DS to your grandma” style conversations.
Image: http://www.combine-lab.com/if-you-cant-explain-it-simply-you-dont-understand-it-well-enough/
7: Entropy Sets In
At College: “Build once, use once, and then forget it!”
7: Entropy Sets In
At Work: “The same model and code can be used in production for years without replacement.”
7: Entropy Sets In
Solution:
1. Build scalable, robust models.
2. Perform regular model evaluation.
3. Re-train the model from time to time.
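Regular evaluation and re-training can be wired together with a simple check. The sketch below compares accuracy on a recent labelled batch against the accuracy recorded at launch and flags the model when the drop exceeds a tolerance. The threshold model, both batches, and the 0.05 tolerance are invented for illustration; a real setup would pick the metric and tolerance from business needs.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match their labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def evaluate(model, batch):
    """Score a model on a batch of (input, label) pairs."""
    predictions = [model(x) for x, _ in batch]
    labels = [y for _, y in batch]
    return accuracy(predictions, labels)

def needs_retraining(baseline_acc, recent_acc, tolerance=0.05):
    """True when recent accuracy fell more than `tolerance` below baseline."""
    return (baseline_acc - recent_acc) > tolerance

def threshold_model(x, cutoff=0.5):
    """Toy model: predicts True when the score exceeds the cutoff."""
    return x > cutoff

# Hypothetical data: the recent batch has drifted, so scores just above
# 0.5 no longer indicate the positive class.
LAUNCH_BATCH = [(0.9, True), (0.1, False), (0.7, True), (0.3, False)]
RECENT_BATCH = [(0.9, True), (0.6, False), (0.55, False), (0.3, False)]

if __name__ == "__main__":
    baseline = evaluate(threshold_model, LAUNCH_BATCH)
    recent = evaluate(threshold_model, RECENT_BATCH)
    print(baseline, recent, needs_retraining(baseline, recent))
```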
Love the problem, not your solution.
Learn to Unlearn → Relearn → Remodel.
BECAUSE ...
Image: https://www.cafepress.com/+entropy_always_wins_3_shot_glass,1289685014
Thank You
Q & A