Seven Sins of Data Science Newbie
_themessier
March 10, 2018
Presented at WiDS Mumbai 2018
Transcript
Seven Sins of a Newbie Data Scientist (and how not to commit them) - Sarah Masud, Red Hat
About Me
GitHub: sara-02
Blog: themessier.wordpress.com
Me Learning To Give Back:
1. Open Source Contributions
2. Blogs
3. Meetups, Conferences
4. Mentorship
5. Program review committees
Let’s begin ;)
Image: https://commons.wikimedia.org/wiki/File:DataScienceLogo.png
Image: https://chroniclesofanassistant.wordpress.com/2010/11/14/first-day-of-work/
Image: https://www.kdnuggets.com/2016/10/big-data-science-expectation-reality.html
1: The Problem Statement
At College: “On a loan dataset, use logistic regression to determine whether a person will default.”
At Work: “We have been collecting these data points for the past 3 years. See what can be done to monetize them.”
Solution:
1. Understand the business needs!
2. Then understand the data collected.
3. Finally, translate the vague problem into a known one.
2: Show Me the Data
At College: “Use the data from Kaggle, UCLA registry, Image-Net, Wikipedia...”
Image: https://me.me/i/show-me-the-data-9747283
At Work: “Use whatever data is legally available, but get this problem solved!”
Solution:
1. Don’t expect someone to hand you the data willingly!
2. Learn to deal with a lack of labelled data.
3. Learn web scraping and data ingestion pipelines.
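As a rough illustration of the web scraping point above, here is a minimal sketch that pulls table rows out of an HTML fragment using only Python's standard library. The `SAMPLE_HTML` fragment, the `TableRowParser` class, and the `scrape_rows` helper are invented for this example; a real ingestion pipeline would fetch pages over HTTP and would need to respect the site's terms of use.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished rows, each a list of cell strings
        self._row = None       # cells of the row being read, or None
        self._in_cell = False  # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows contain only <th>, so they stay empty
                self.rows.append(self._row)
            self._row = None


# Hypothetical page fragment standing in for a downloaded page.
SAMPLE_HTML = """
<table>
  <tr><th>city</th><th>listings</th></tr>
  <tr><td>Mumbai</td><td>120</td></tr>
  <tr><td>Pune</td><td>45</td></tr>
</table>
"""


def scrape_rows(html):
    """Parse an HTML string and return its data rows."""
    parser = TableRowParser()
    parser.feed(html)
    return parser.rows


rows = scrape_rows(SAMPLE_HTML)
print(rows)
```

The header row is skipped because it holds `<th>` cells only; anything beyond this toy fragment (nested tables, malformed markup) would likely call for a dedicated parsing library.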
3: Using A Missile Gun To Kill The Chicken
At College: “Sounds cool! Let me use this SOTA algorithm.”
Image: https://pbs.twimg.com/media/B83v847CUAAQHKg.jpg:large
At Work: “Provide us with a cheap, accurate, stable solution.”
Image: https://www.someecards.com/usercards/viewcard/if-you-torture-the-data-they-will-confess-94dd7
Solutions:
1. Not every problem needs to be a DS problem!
2. Use switch cases if that is enough.
3. Understand the business constraints.
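To make the "switch cases" advice concrete, the sketch below routes support tickets with plain keyword rules and scores them on a tiny hand-labelled sample. The queue names, keywords, and sample tickets are all made up for illustration; the point is only that a rule baseline is cheap to build and gives a number that any model has to beat.

```python
def route_ticket(text):
    """Return a queue name using plain keyword rules (hypothetical queues)."""
    lowered = text.lower()
    if "refund" in lowered or "charged" in lowered:
        return "billing"
    if "password" in lowered or "login" in lowered:
        return "account"
    if "crash" in lowered or "error" in lowered:
        return "technical"
    return "general"


# Invented examples with the queue a human would have picked.
LABELLED_SAMPLE = [
    ("I was charged twice", "billing"),
    ("Cannot login after reset", "account"),
    ("App shows an error on start", "technical"),
    ("Where is your office?", "general"),
    ("Login page crashes", "technical"),  # the rules get this one wrong
]


def baseline_accuracy(sample):
    """Fraction of tickets where the rules agree with the human label."""
    hits = sum(route_ticket(text) == label for text, label in sample)
    return hits / len(sample)


print(baseline_accuracy(LABELLED_SAMPLE))  # prints 0.8
```

If the rules already meet the business target, the problem may not need a model at all; if they do not, the misrouted tickets show where a learned approach could earn its cost.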
4: The Value of Your Work
At College:
1. Accuracy of the model.
2. Number of research papers.
3. Subject grade!
At Work:
1. RoI.
2. RoI.
3. RoI.
Image: https://me.me/i/show-me-the-money-memes-11885126
Solution:
1. Understand the business.
2. Optimise for Accuracy vs Cost.
3. Keep the end user in mind.
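One way to read "Optimise for Accuracy vs Cost" is to compare candidate models by the net value they return rather than by accuracy alone. The sketch below does that under strong simplifying assumptions: every correct prediction is worth a fixed amount, every error costs a fixed amount, and each candidate has a flat monthly running cost. All names and figures are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float      # fraction of correct predictions on a holdout set
    monthly_cost: float  # serving plus maintenance cost, in currency units


def net_value(c, predictions_per_month, value_per_correct, cost_per_error):
    """Monthly value of correct predictions minus error and running costs."""
    correct = c.accuracy * predictions_per_month
    wrong = predictions_per_month - correct
    return correct * value_per_correct - wrong * cost_per_error - c.monthly_cost


# Hypothetical candidates; none of these numbers come from the talk.
candidates = [
    Candidate("keyword rules", 0.80, 100.0),
    Candidate("logistic regression", 0.88, 500.0),
    Candidate("deep ensemble", 0.90, 5000.0),
]

best = max(candidates, key=lambda c: net_value(c, 10_000, 0.5, 1.0))
for c in candidates:
    print(f"{c.name}: {net_value(c, 10_000, 0.5, 1.0):.0f}")
print("best:", best.name)
```

With these made-up figures the most accurate candidate is the least valuable, because its running cost outweighs the extra correct predictions.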
5: Serving the Model
At College: “It is about building the most accurate system and running it from the terminal. And that is it!”
At Work:
1. How many concurrent users can we serve?
2. How much delay can we afford before we lose the customer?
Industry:
1. How is the model exposed to the UI?
2. Can the model be distributed?
3. Can the model scale as the data grows?
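The questions about concurrent users and acceptable delay can be explored with a small load experiment before any real deployment. The following is a minimal sketch, assuming a stand-in `predict` function that sleeps for 10 ms in place of real inference and a burst of requests that all arrive at once; the payload, request count, and worker count are arbitrary choices for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

PAYLOAD = [0.4, 0.9]  # hypothetical feature vector


def predict(features):
    """Stand-in for a real model call; the sleep simulates inference time."""
    time.sleep(0.01)
    return sum(features) > 1.0


def measure_latency(n_requests, n_workers):
    """Return sorted per-request latencies in seconds, including queue wait."""
    arrived = time.perf_counter()  # assumption: all requests arrive at once

    def handle(_):
        predict(PAYLOAD)
        return time.perf_counter() - arrived

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return sorted(pool.map(handle, range(n_requests)))


latencies = measure_latency(n_requests=40, n_workers=8)
median = latencies[len(latencies) // 2]
p95 = latencies[int(0.95 * (len(latencies) - 1))]
print(f"median={median * 1000:.1f} ms  p95={p95 * 1000:.1f} ms")
```

Because requests queue behind a fixed number of workers, the slowest requests wait several inference times; rerunning with different `n_workers` values gives a first feel for how capacity trades against delay.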
6: Know Thy Audience
At College: “Technical mentors, peers.”
At Work: “The audience is always a mixed bag.”
Solution:
1. Know your concepts well.
2. Practise explaining DS the way you would to your grandma.
Image: http://www.combine-lab.com/if-you-cant-explain-it-simply-you-dont-understand-it-well-enough/
7: Entropy Sets In
At College: “Build once, use once, and then forget it!”
At Work: “The same model and code can be used in production for years without replacement.”
Solution:
1. Build scalable, robust models.
2. Perform regular model evaluation.
3. Re-train the model from time to time.
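Regular evaluation and periodic re-training can be wired together with a simple monitor. The sketch below tracks accuracy over a rolling window of recent predictions and flags when it drops below a threshold; the `AccuracyMonitor` class, the window size, and the threshold are assumptions made for this example, not something prescribed in the talk.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions."""

    def __init__(self, window=50, threshold=0.8):
        self.outcomes = deque(maxlen=window)  # True for each correct prediction
        self.threshold = threshold

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        # Only judge once the window is full, to avoid noisy early alarms.
        full = len(self.outcomes) == self.outcomes.maxlen
        return full and self.accuracy() < self.threshold


monitor = AccuracyMonitor(window=10, threshold=0.8)

for _ in range(10):      # the model starts out matching the ground truth
    monitor.record(1, 1)
print(monitor.accuracy(), monitor.needs_retraining())  # prints 1.0 False

for _ in range(4):       # the data drifts and recent predictions go wrong
    monitor.record(1, 0)
print(monitor.accuracy(), monitor.needs_retraining())  # prints 0.6 True
```

In practice the true labels often arrive late, so the window would be filled as ground truth becomes available rather than at prediction time.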
Love the problem, not your solution.
Learn to Unlearn → Relearn → Remodel.
BECAUSE ...
Image: https://www.cafepress.com/+entropy_always_wins_3_shot_glass,1289685014
Thank You
Q & A