#datacareer - Analyst? Engineer? Scientist? Roles in industry and startups with Irena Bojarovska

“The problem for AI in Europe is not the money, it is finding the talent” (Leading European AI practitioner)

Irena Bojarovska is a data scientist with a passion for mathematics and automation. After finishing her PhD in 2015 she moved from academia to industry as a data analyst, and is now expanding her career horizons in the world of data science.

Data and Artificial Intelligence constitute the fastest-growing job market for the highly qualified. This workshop offers you the following:

- How to find the right role for you among the emerging specialized roles, such as data engineering, data analytics, data science, machine learning, and deep learning.
- Pragmatic advice on handling your CV and skills profile for your next role.
- Orientation on the labour market: what employers miss most, and which #aiusecase examples are winning.

Dânia Meira

August 31, 2020

Transcript

  1. #DATACAREER

    “No matter who you are, self-improvement is one of the most important and most overlooked attributes of young AI talent. It only takes four years of experience to become a senior AI researcher, or five years of experience to lead an entire institute. The determination and discipline to improve both the hard and soft skills continually will be the deciding factor in an AI researcher’s career.” (Jean-François Gagné)
  2. DÂNIA MEIRA

    - Founding member, AI Guild
    - ML models for predictive analytics
    - Former bootcamp teacher
    - #datacareer since 2012
    - LinkedIn
  3. IRENA BOJAROVSKA

    - One of the first AI Guild members
    - Juggling data at Air Berlin for 1.5 years
    - Since 2017 at Zalando’s Marketing Data & Analytics department
    - PhD in Applied Mathematics
    - Mother of two daughters
    - LinkedIn
  4. CHRIS ARMBRUSTER

    - Founding member, AI Guild
    - 10,000 Data Scientists for Europe
    - Former bootcamp director
    - #datacareer coaching since 2017
    - LinkedIn
  5. AI GUILD CAREER COACHING

    - Running for junior and for senior practitioners since early 2019
    - Runs monthly for AI Guild members
    - Coaching capacity per year: 240 participants
  6. INSIGHTS FROM CAREER COACHING

    - Search for the 1st as well as the 2nd role may take >6 months
    - Upgrading inside a company may be easier
    - Job advertisements may be misleading and confusing
    - The role ‘in real life’ may not match the talent’s expectations
  7. OBSERVING THE MARKET

    - Specialization and differentiation of roles
    - Rising value of domain expertise
    - Experimental phase with PoC plays ending
    - Increasing focus on deployment
  8. PRODUCTIONIZING MACHINE LEARNING

    [Diagram] Components: ML Models, Data Collection, Data Quality, Infrastructure, Process Management Tools, Machine Resource Management, Monitoring, Configuration, Feature Extraction, Analysis, Data Preprocessing, Parameter Configuration, Offline Validation, Business Logic, A/B Testing

    #dataroles: Data Engineer, Data Scientist, Data Analyst, ML Engineer, AI Researcher

    See also: “Hidden Technical Debt in Machine Learning Systems” by Sculley et al., Google, Inc., 2015
  9. #DATAROLES

    - Data Scientist
      Task: Understand business case, build features to train predictive models to address such use cases
      Skill: Statistics, SQL, programming (e.g. Python, R), ML & DL techniques
    - Data Analyst
      Task: Business and data understanding to report on what happens
      Skill: Descriptive analytics, SQL, statistics, dashboarding and visualization tools
    - Data Engineer
      Task: Build and maintain infrastructure and pipeline to collect, clean and pre-process data
      Skill: Distributed systems, databases, software engineering
    - Machine Learning Engineer
      Task: Optimize, deploy and maintain machine learning models in production
      Skill: Software engineering, DevOps and systems architecture
    - AI Researcher
      Task: Build new machine learning algorithms, find custom scientific solutions
      Skill: Research, presenting at conferences, writing publications
  10. ‘COOKING’ DATA: EXPLAINING SPECIALIZATION

    [Diagram] Components: ML Models, Data Collection, Data Quality, Infrastructure, Process Management Tools, Machine Resource Management, Monitoring, Configuration, Feature Extraction, Analysis, Data Preprocessing, Parameter Configuration, Offline Validation, Business Logic, A/B Testing

    Cooking analogies: Sowing, Harvesting, Choose recipe, Prepare ingredients, Customers tasting, Kitchen tasting, Use utensils, Try combinations of appliances and recipes, Kitchen space and available appliances

    See also: “Understanding a Machine Learning Workflow Through Food” by Daniel Godoy
  11. UNDERSTANDING #DATAROLES

    Kitchen tasks:
    - Build kitchen appliances
    - Create and use recipes to cook
    - Check quality of ingredients and recipes
    - Process ingredients at scale
    - Turn a recipe into many dishes served efficiently

    Roles: Data Engineer, Data Scientist, Data Analyst, ML Engineer, AI Researcher
  12. IRENA BOJAROVSKA - TIMELINE

    Timeline markers: 2002, 2006, 2012, 2016, 2017

    - Gymnasium in Macedonia
    - BSc & MSc in Russia
    - PhD at TU Berlin
    - Analyst at Air Berlin
    - DS at Zalando
  13. SKILL SETS FOR DATA ROLES

    #dataroles: Data Engineer, Data Scientist, Data Analyst, ML Engineer, AI Researcher

    Diagram labels: Deep, Cross-discipline

    Skill areas: ML Algorithms, Visualization, Domain Expertise, Programming, SW Engineering, Communication, Tools, Platforms, Statistics
  14. Data Analyst

    - SQL, R, Python
    - Descriptive statistics, hypothesis testing, probability distributions
    - Regression
    - Excel, Tableau
    - Data interpretation, logical approach
    - Marketing, Healthcare, E-commerce, Mobility, Manufacturing, ...

    Skill areas (Cross-discipline): ML Algorithms, Visualization, Domain Expertise, Programming, SW Engineering, Communication, Tools, Platforms, Statistics
  15. Data Scientist

    - Jupyter Lab, RStudio
    - Git, Docker, Airflow, Jenkins
    - SQL, R, Python
    - Python: pandas, scikit-learn
    - R: dplyr, forecast
    - Regression, Classification, Clustering, Deep Learning
    - Data interpretation, logical approach
    - Marketing, Healthcare, E-commerce, Mobility, Manufacturing, ...

    Skill areas (Cross-discipline): ML Algorithms, Visualization, Domain Expertise, Programming, SW Engineering, Communication, Tools, Platforms, Statistics
  16. WHICH ONE IS FOR YOU, BASED ON YOUR…

    - Data Analyst
      BACKGROUND: BSc can be sufficient
      INTERESTS: Presenting results (historical data) as facts
      SKILLS: Few and easy to learn
    - Data Scientist
      BACKGROUND: PhD is common
      INTERESTS: Presenting predictions based on complex models
      SKILLS: SE knowledge and fast learning

    ↯ Often the role could still be called Data Analyst
  17. TRANSITIONING FROM ANALYST TO SCIENTIST REQUIRES...

    MORE OF...
    - Software engineering skills
    - Mathematical and statistical knowledge
    - Conferences

    LESS OF...
    - Direct participation in business decisions
    - Stakeholder management
    - Manual work

    REMAINS...
    - The passion for data
    - The troubles of data
    - The ability to help make data-based decisions
  18. IRENA ON TRANSITIONING FROM ANALYST TO SCIENTIST

    - Moving internally is easier
    - Automated reporting leaves space for learning new skills
    - Being active in the community opens new doors
    - Constant learning is essential
  19. Data Engineer

    - Hadoop: Hive, Pig, Spark
    - Databases
    - Git, Docker, Airflow, Jenkins
    - SQL, Bash, Java, Scala, Python
    - Data pipelines
    - Data structures
    - Linux, AWS, Google Cloud Platform, Microsoft Azure

    Skill areas (Cross-discipline): ML Algorithms, Visualization, Domain Expertise, Programming, SW Engineering, Communication, Tools, Platforms, Statistics
  20. ML Engineer

    - Hadoop: Hive, Pig, Spark
    - Git, Docker, Airflow, Jenkins
    - SQL, Bash, Java, Scala, Python
    - scikit-learn
    - Linux, AWS, Google Cloud Platform, Microsoft Azure
    - Microservices
    - Infrastructure

    Skill areas (Cross-discipline): ML Algorithms, Visualization, Domain Expertise, Programming, SW Engineering, Communication, Tools, Platforms, Statistics
  21. KEY INDUSTRY CHALLENGES*

    - Data volume, accessibility, and quality
    - Trust of customers, stakeholders, and employees, including governance, compliance, and reputation
    - Competence of employees, management, and company

    *Based on the 2019 PwC report “Künstliche Intelligenz in Unternehmen”, p. 12
  22. SOME STARTUP CHALLENGES

    - Data volume, accessibility, and quality
    - Company funding and runway
    - Expertise levels and team size
  23. WRAPPING UP

    - Keep observing the market
    - Look for matches between employers’ needs and your skills profile
    - Scan the industry and startups for the most promising #aiusecase