Datalab Workshop - technical track
Providenz - Laurent Paoletti
September 29, 2014
Data storage, analysis, and visualization, plus machine learning
Transcript
DATALAB: the workshop. Laurent Paoletti (@providenz), TVT, 29 September 2014
Definitions: data, big data, data science
Criteria: volume, velocity, variety, complexity
Typology: structured, semi-structured, and unstructured data
Typology: text, timestamped, geographic, science and finance, logs, graph, image/sound/video
Sources: open data, services and APIs, organic, crowdsourcing, connected objects, purchase, scraping and extraction
Sources: API
Platforms, infrastructure: home, server(s), cloud, custom, GPU, FPGA
Platforms, persistence: files (Excel, CSV, HDF5)
Platforms, persistence: relational databases (MySQL, PostgreSQL, SQL Server, Oracle)
Platforms, persistence: GIS (PostGIS)
Platforms, persistence: graphs (Neo4j)
Platforms, persistence: search (Elasticsearch)
Platforms, persistence: Hadoop, Spark, HBase
Platforms, persistence: MapReduce
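The MapReduce slide names the model without showing it. A minimal single-process sketch of its three steps (map, shuffle, reduce) is given here, using the classic word-count task. The function names and the input sentences are illustrative assumptions, not part of the deck. A real Hadoop or Spark job distributes these steps across machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: combine the values of each key into one result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(mapped))


# Hypothetical input documents, chosen only for illustration.
docs = ["big data", "data science", "open data"]
counts = word_count(docs)
print(counts)
```

Because each call to `map_phase` only sees one document and each reduction only sees one key, both steps could run in parallel, which is the property the distributed frameworks exploit.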
Analysis, preparation: extraction, cleaning, ETL
Analysis: filtering, transformation, statistics
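The three analysis steps named on this slide can be sketched with the Python standard library alone. The sensor names and readings are hypothetical values invented for the example, and a real workflow would more likely rely on R, SQL, or a data-frame library.

```python
import statistics

# Hypothetical records: (sensor name, temperature in degrees Fahrenheit).
# None marks a missing measurement.
readings = [
    ("sensor_a", 77.0),
    ("sensor_a", None),
    ("sensor_b", 68.0),
    ("sensor_b", 86.0),
]


def fahrenheit_to_celsius(value):
    """Convert a temperature from Fahrenheit to Celsius."""
    return (value - 32.0) * 5.0 / 9.0


# Filtering: drop records with a missing measurement.
valid = [temp for _, temp in readings if temp is not None]

# Transformation: convert every remaining value to Celsius.
celsius = [fahrenheit_to_celsius(temp) for temp in valid]

# Statistics: summarise the transformed values.
summary = {
    "count": len(celsius),
    "mean": statistics.mean(celsius),
    "median": statistics.median(celsius),
    "stdev": statistics.stdev(celsius),
}
print(summary)
```

The order matters: filtering first keeps the missing value from reaching the conversion function, where it would raise a `TypeError`.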
Analysis, tools: R, SQL, Python, OpenRefine
Machine learning: "the ability given to a machine to ingest data, to learn, and to improve through its experience"
Machine learning: anti-spam, recommendations, scoring, price optimization, identification
Machine learning 101: training data
Machine learning 101: setosa
Machine learning 101: dataset, model, data, prediction, human learning
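The dataset, model, data, prediction flow can be sketched with a 1-nearest-neighbour classifier written in plain Python. The measurements are a handful of rows in the style of the Iris dataset (which the "setosa" slide appears to refer to) and should be treated as illustrative. The choice of nearest-neighbour is an assumption made for brevity, not the method used in the talk.

```python
import math

# Training dataset: (sepal length, sepal width, petal length, petal width)
# in cm, paired with a species label. Illustrative Iris-style rows only.
training_data = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
]


def predict(features, dataset=training_data):
    """Return the label of the training example closest to `features`.

    Closeness is the Euclidean distance between the feature tuples.
    """
    nearest = min(dataset, key=lambda example: math.dist(example[0], features))
    return nearest[1]


# New, unlabelled data goes in; a predicted label comes out.
new_flower = (5.0, 3.4, 1.5, 0.2)
print(predict(new_flower))
```

With this technique the "model" is simply the stored training data. Most other algorithms compress the dataset into learned parameters instead.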
Machine learning 101: "For a long time, we thought that Tamoxifen was roughly 80% effective for breast cancer patients. But now we know much more: we know that it’s 100% effective in 70% to 80% of the patients, and ineffective in the rest."
Machine learning: regression, classification
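Regression predicts a continuous number, whereas classification predicts a label from a fixed set. A minimal sketch of the regression case follows, fitting a straight line by ordinary least squares. The surface areas and prices are hypothetical values invented for the example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: surface area (m^2) and price (thousands).
areas = [20.0, 40.0, 60.0, 80.0]
prices = [110.0, 190.0, 310.0, 390.0]

slope, intercept = fit_line(areas, prices)


def predict_price(area):
    """Regression output: a number, not a category."""
    return slope * area + intercept


print(predict_price(70.0))
```

A classifier trained on the same inputs would instead return a category such as "cheap" or "expensive", which is the distinction the slide draws.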
Machine learning, tools: R, Java, Python, SaaS
Visualization: http://flowingdata.com/page/2/
Visualization (web): http://www.brightpointinc.com/interactive/political_influence/index.html?source=d3js
Visualization, tools: Excel and gnuplot; Python with Matplotlib; web with D3.js
Resources:
General: http://www.oreilly.com/data/
Pandas: http://pandas.pydata.org/
R: http://www.r-project.org/
Python: https://www.python.org/
Machine learning: http://scikit-learn.org/
OpenRefine: http://openrefine.org/
PostGIS: http://postgis.net/
Elasticsearch: http://www.elasticsearch.org/
Hadoop: http://hadoop.apache.org/
Spark: https://spark.apache.org/
HBase: http://hbase.apache.org/
D3: http://d3js.org/
BigML: https://bigml.com/
Prediction API: https://cloud.google.com/prediction/?hl=fr
Thank you. Laurent Paoletti (@providenz), TVT, 29 September 2014