#datacareer - Analyst? Engineer? Scientist? Roles in industry and startups with senior data engineer Ellen König

“The problem for AI in Europe is not the money, it is finding the talent” (Leading European AI practitioner)

Data and Artificial Intelligence constitute the fastest-growing job market for the highly qualified. This workshop offers you the following:

- How to find the right role for you among the emerging specialized roles in e.g. data engineering, data analytics, data science, machine learning, and deep learning.
- Pragmatic advice on handling your CV and skills profile for your next role.
- Orientation on the labour market, what employers miss most, and which #aiusecase are winning.

Dânia Meira

July 06, 2020

Transcript

  1. #DATACAREER

    “No matter who you are, self-improvement is one of the most important and most overlooked attributes of young AI talent. It only takes four years of experience to become a senior AI researcher, or five years of experience to lead an entire institute. The determination and discipline to improve both the hard and soft skills continually will be the deciding factor in an AI researcher’s career.” (Jean-François Gagné)
  2. Dânia Meira

    - Founding member, AI Guild
    - ML models for predictive analytics
    - Former bootcamp teacher
    - #datacareer since 2012
    - LinkedIn
  3. Ellen König

    - Senior Data Engineer, ThoughtWorks
    - 3 years in Data Engineering, 3 in Data Science, 5 in Software Engineering
    - Worked in industry (consulting, startups) and non-profits
    - MSc in Computer Science
    - Twitter and Website
  4. Chris Armbruster

    - Founding member, AI Guild
    - 10,000 Data Scientists for Europe
    - Former bootcamp director
    - #datacareer coaching since 2017
    - LinkedIn
  5. AI GUILD CAREER COACHING

    - Running for junior and for senior practitioners since early 2019
    - Runs monthly for AI Guild members
    - Coaching capacity per year: 240 participants
  6. INSIGHTS FROM CAREER COACHING

    - Search for the 1st as well as the 2nd role may take >6 months
    - Upgrading inside a company may be easier
    - Job advertisements may be misleading and confusing
    - The role ‘in real life’ may not match the talent’s expectations
  7. OBSERVING THE MARKET

    - Specialization and differentiation of roles
    - Rising value of domain expertise
    - Experimental phase with PoC plays ending
    - Increasing focus on deployment
  8. PRODUCTIONIZING MACHINE LEARNING

    [Diagram] Components: ML Models; Data Collection; Data Quality; Infrastructure; Process Management Tools; Machine Resource Management; Monitoring; Configuration; Feature Extraction; Analysis; Data Preprocessing; Parameter Configuration; Offline Validation; Business Logic; A/B Testing

    #dataroles: Data Engineer, Data Scientist, Data Analyst, ML Engineer, AI Researcher

    See also: “Hidden Technical Debt in Machine Learning Systems” by Sculley et al., Google Inc., 2015
  9. #DATAROLES

    - Data Scientist
      Task: Understand business case, build features to train predictive models to address such use cases
      Skill: Statistics, SQL, programming (e.g. Python, R), ML & DL techniques
    - Data Analyst
      Task: Business and data understanding to report on what happens
      Skill: Descriptive analytics, SQL, statistics, dashboarding and visualization tools
    - Data Engineer
      Task: Build and maintain infrastructure and pipeline to collect, clean and pre-process data
      Skill: Distributed systems, databases, software engineering
    - Machine Learning Engineer
      Task: Optimize, deploy and maintain machine learning models in production
      Skill: Software engineering, DevOps and systems architecture
    - AI Researcher
      Task: Build new machine learning algorithms, find custom scientific solutions
      Skill: Research, presenting at conferences, writing publications
  10. ‘COOKING’ DATA: EXPLAINING SPECIALIZATION

    [Diagram] Components: ML Models; Data Collection; Data Quality; Infrastructure; Process Management Tools; Machine Resource Management; Monitoring; Configuration; Feature Extraction; Analysis; Data Preprocessing; Parameter Configuration; Offline Validation; Business Logic; A/B Testing

    Cooking analogy labels: Sowing; Harvesting; Choose recipe; Prepare ingredients; Customers tasting; Kitchen tasting; Use utensils; Try combinations of appliances and recipes; Kitchen space and available appliances

    See also: “Understanding a Machine Learning Workflow Through Food” by Daniel Godoy
  11. UNDERSTANDING #DATAROLES

    [Diagram] Kitchen analogy: Build kitchen appliances; Create and use recipes to cook; Check quality of ingredients and recipes; Process ingredients at scale; Turn a recipe into many dishes served efficiently

    Roles: Data Engineer, Data Scientist, Data Analyst, ML Engineer, AI Researcher
  12. WHY CONSIDER DATA ENGINEERING?

    - You enjoy coding more than modeling or analysis
    - You like to think about the quality of your code
    - Complex technology is more intriguing to you than complex maths
    - You prefer collaborating intensely with other people over strong autonomy
    - You prefer being closer to the technology than the product and business
    - You love building things that work and figuring out why things don’t work
  13. SKILL SETS FOR DATA ROLES

    [Diagram] Data roles: Data Engineer, Data Scientist, Data Analyst, ML Engineer, AI Researcher

    Diagram labels: Deep; Cross-discipline

    Skill areas: ML Algorithms; Visualization; Domain Expertise; Programming; SW Engineering; Communication; Tools; Platforms; Statistics
  14. Data Engineer

    [Diagram] Skill areas: Cross-discipline; ML Algorithms; Visualization; Domain Expertise; Programming; SW Engineering; Communication; Tools; Platforms; Statistics

    Entries shown: Hadoop: Hive, Pig, Spark; Databases; Git, Docker, Airflow, Jenkins; SQL, Bash, Java, Scala, Python; Data pipelines; Data structures; Linux, AWS, Google Cloud Platform, Microsoft Azure
  15. ELLEN ON GETTING STARTED IN DATA ENGINEERING

    Consists of four technical skill areas (in my order of importance!)

    1. Software engineering foundations
    2. Data modeling and analysis
    3. Cloud infrastructure and operations
    4. “Big” data technologies/distributed data systems (Spark, Flink, Hadoop, Kafka etc.)
  16. BUILDING SKILLS

    - Start with small projects or project-based classes/tutorials
    - Check out this thread on Twitter: https://twitter.com/ellen_koenig/status/1261571177354592256
  17. CRUCIAL SKILLS AND CONCEPTS OF SOFTWARE ENGINEERING

    - Testing
    - Version control
    - Writing clean code (understandable and maintainable)
    - Deployment
    - Monitoring and logging
    - Software design
    - Software architecture
  18. STANDARDS AND QUALITY

    Data engineers build production software! We are held to the same standards as other production software engineers.
  19. Data Analyst

    [Diagram] Skill areas: Cross-discipline; ML Algorithms; Visualization; Domain Expertise; Programming; SW Engineering; Communication; Tools; Platforms; Statistics

    Entries shown: SQL; R, Python; Descriptive statistics; Hypothesis testing; Probability distributions; Regression; Excel, Tableau; Data interpretation; Logical approach; Marketing, Healthcare, E-commerce, Mobility, Manufacturing, ...
  20. ML Engineer

    [Diagram] Skill areas: Cross-discipline; ML Algorithms; Visualization; Domain Expertise; Programming; SW Engineering; Communication; Tools; Platforms; Statistics

    Entries shown: Hadoop: Hive, Pig, Spark; Git, Docker, Airflow, Jenkins; SQL, Bash, Java, Scala, Python; sk-learn; MLlib; Linux, AWS, Google Cloud Platform, Microsoft Azure; Microservices; Infrastructure
  21. Data Scientist

    [Diagram] Skill areas: Cross-discipline; ML Algorithms; Visualization; Domain Expertise; Programming; SW Engineering; Communication; Tools; Platforms; Statistics

    Entries shown:
    - Jupyter Lab, RStudio
    - Git, Docker, Airflow, Jenkins
    - SQL, Bash, R, Scala, Python
    - Python: numpy, scipy, pandas, matplotlib, scikit-learn, keras, Prophet, NLTK, gensim
    - R: dplyr, sqldf, tidyr, magrittr, lubridate, shiny, ggplot2, forecast, MLR, ranger, xgboost, Prophet
    - MLlib
    - Linux, AWS, Google Cloud Platform, Microsoft Azure
    - Python: matplotlib, seaborn
    - R: shiny, ggplot2
    - Marketing, Healthcare, E-commerce, Mobility, Manufacturing, ...
  22. KEY INDUSTRY CHALLENGES*

    - Data volume, accessibility, and quality
    - Trust of customers, stakeholders, and employees, including governance, compliance, and reputation
    - Competence of employees, management, and company

    *Based on the 2019 PwC report “Künstliche Intelligenz in Unternehmen”, p. 12
  23. SOME STARTUP CHALLENGES

    - Data volume, accessibility, and quality
    - Company funding and runway
    - Expertise levels and team size
  24. WRAPPING UP

    - Keep observing the market
    - Look for matches between employers’ needs and your skills profile
    - Scan the industry and startups for the most promising #aiusecase