Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Atelier Datalab - volet technique
Search
Providenz - Laurent Paoletti
September 29, 2014
Technology
0
74
Atelier Datalab - volet technique
Stockage, analyse, visualisation de données et machine learning
Providenz - Laurent Paoletti
September 29, 2014
Tweet
Share
More Decks by Providenz - Laurent Paoletti
See All by Providenz - Laurent Paoletti
Introduction au machine learning
providenz
0
200
Des builds front plus rapides
providenz
0
48
Back to front
providenz
0
140
Machine Learning for the rest of us
providenz
1
190
Brunch, le builder pour les developpeurs pressés
providenz
0
160
Postgresql la plateforme de vos données
providenz
0
260
Performance web (Brown bag lunch)
providenz
0
40
Montée en charge
providenz
0
36
Présentation de django
providenz
0
43
Other Decks in Technology
See All in Technology
神回のメカニズムと再現方法/Mechanisms and Playbook for Kamikai scrumat2025
moriyuya
4
690
SREとソフトウェア開発者の合同チームはどのようにS3のコストを削減したか?
muziyoshiz
1
200
The Cake Is a Lie... And So Is Your Login’s Accessibility
leichteckig
0
100
Simplifying Cloud Native app testing across environments with Dapr and Microcks
salaboy
0
120
「れきちず」のこれまでとこれから - 誰にでもわかりやすい歴史地図を目指して / FOSS4G 2025 Japan
hjmkth
1
100
Modern_Data_Stack最新動向クイズ_買収_AI_激動の2025年_.pdf
sagara
0
240
生成AIとM5Stack / M5 Japan Tour 2025 Autumn 東京
you
PRO
0
240
研究開発部メンバーの働き⽅ / Sansan R&D Profile
sansan33
PRO
3
20k
いまさら聞けない ABテスト入門
skmr2348
1
230
『バイトル』CTOが語る! AIネイティブ世代と切り拓くモノづくり組織
dip_tech
PRO
1
110
AI時代だからこそ考える、僕らが本当につくりたいスクラムチーム / A Scrum Team we really want to create in this AI era
takaking22
7
4k
許しとアジャイル
jnuank
1
140
Featured
See All Featured
Music & Morning Musume
bryan
46
6.8k
Docker and Python
trallard
46
3.6k
Building a Scalable Design System with Sketch
lauravandoore
462
33k
Designing for humans not robots
tammielis
254
26k
Optimising Largest Contentful Paint
csswizardry
37
3.4k
Put a Button on it: Removing Barriers to Going Fast.
kastner
60
4k
Documentation Writing (for coders)
carmenintech
75
5k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
PRO
23
1.5k
How to train your dragon (web standard)
notwaldorf
96
6.3k
Building Applications with DynamoDB
mza
96
6.7k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
7
900
Bootstrapping a Software Product
garrettdimon
PRO
307
110k
Transcript
DATALAB l ’atelier Laurent Paoletti @providenz TVT - 29 septembre
2014
DATA BIG DATA DATASCIENCE définitions
VOLUME VÉLOCITÉ VARIÉTÉ COMPLEXITÉ critères
DONNÉES STRUCTURÉES SEMI-STRUCTURÉES NON STRUCTURÉES typologie
TEXTE HORODATEES GÉOGRAPHIQUES SCIENCE - FINANCE LOGS GRAPHE IMAGE/SON/VIDEO typologie
OPENDATA SERVICES - API ORGANIQUE CROWDSOURCING OBJETS CONNECTÉS ACHAT SCRAPING
- EXTRACTION sources
sources - api
HOME SERVEUR(S) CLOUD CUSTOM ! GPU FPGA plateformes -infrastructure
FICHIERS excel csv hdf5 plateformes -persistance
DB RELATIONELLES ! MYSQL POSTGRESQL SQLSERVER, ORACLE plateformes -persistance
SIG:POSTGIS plateformes -persistance
GRAPHES: NEO4J plateformes -persistance
RECHERCHE : ELASTICSEARCH plateformes -persistance
HADOOP SPARK HBASE plateformes -persistance
MAP-REDUCE plateformes -persistance
EXTRACTION NETTOYAGE ETL analyse - préparation
FILTRAGE TRANSFORMATION STATISTIQUES analyse
R SQL PYTHON OPENREFINE analyse - outils
« capacité qu’on donne à une machine d’ingérer des données
à apprendre et de s’enrichir grâce à son expérience » machine learning
machine learning ANTI-SPAM RECOMMANDATIONS SCORING OPTIMISATION DE PRIX IDENTIFICATION
TRAINING DATA machine learning 101
machine learning 101
machine learning 101 setosa
machine learning 101
machine learning 101 DATASET MODELE DATA PREDICTION apprentissage humain
« For a long time, we thought that Tamoxifen was
roughly 80% effective for breast cancer patients. But now we know much more: we know that it’s 100% effective in 70% to 80% of the patients, and ineffective in the rest. » ! machine learning 101
machine learning regression classification !
machine learning - outils R JAVA PYTHON SAAS ! !
visualisation http://flowingdata.com/page/2/
http://www.brightpointinc.com/interactive/political_influence/index.html?source=d3js WEB visualisation
http://www.brightpointinc.com/interactive/political_influence/index.html?source=d3js visualisation
EXCEL - GNUPLOT PYTHON - MATPLOTLIB WEB - D3.JS !
! visualisation - outils
Général: http://www.oreilly.com/data/ Pandas: http://pandas.pydata.org/ R: http://www.r-project.org/ Python: https://www.python.org/ Machine learning:
http://scikit-learn.org/ Openrefine: http://openrefine.org/ Postgis: http://postgis.net/ Elasticsearch: http://www.elasticsearch.org/ Hadoop: http://hadoop.apache.org/ Spark: https://spark.apache.org/ Hbase: http://hbase.apache.org/ D3: http://d3js.org/ Bigml: https://bigml.com/ Prediction API: https://cloud.google.com/prediction/?hl=fr ressources
merci Laurent Paoletti @providenz TVT - 29 septembre 2014