Building an efficient data team

My talk at the Developer Relations Munich meetup
https://www.meetup.com/Dev-Rel-Munich/events/267915934/
Forming an efficient data team, setting up a good environment for productive collaboration, and defining the right expectations for different data roles. We will discuss all stages, from hiring to getting feedback on performance/post-implementation reviews.

Alex Tselikov

January 27, 2020

Transcript

  1. About me

    · Data Science, Machine Learning, Data Engineering, 7 years
    · Software Engineering in a product company, 5 years
    · Engineering management, employee growth, 1.5 years
    · Internal & External Data hiring, 2 years
    · Bringing data products from prototyping to production
    · Product development using Agile principles
    · Running Machine Learning at large scale
    · Open source contributor, Kaggle Expert
  2. Fewer than 15% of businesses say their organization’s culture supports data-driven decision-making

    *Data-Driven Mindset Report (2019)
    Become a data-driven ...
  3. Ok, let’s just hire a couple of Data Scientists

    - How about we hire a senior DS first and then let him build a team?
  4. Ok, let’s just hire a couple of Data Scientists

    - How about we hire a senior DS first and then let him build a team?
    - How about juniors? 100 applications - with online courses, pet projects, etc. …
  5. We have a lot of data but it is in ...

    - In 1000s of CSV files
    - Available in a streaming API
    - SAP
    - Oracle / DB2
    - On my USB in MATLAB format
    ...
  6. We have a lot of data but it is in ...

    Ok, Google, from jupyter.notebook import what?
    - In 1000s of CSV files
    - Available in a streaming API
    - SAP
    - Oracle / DB2
    - On my USB in MATLAB format
    ...
  7. Data engineering? What is that?

    • Ingestion
    • Buffering
    • Processing
    • Storage
    • Serving
    • Governance
    https://www.dataquest.io/blog/what-is-a-data-engineer/
  8. Data engineering skills

    … carry out data engineering and AI infrastructure tasks. Companies may refer to this position as data engineer, software engineer, software development engineer, software engineer-AI Infrastructure, software engineer-data, etc.
  9. Data engineering vs. Data Science skills

    Bad news: there are 10 times more DS job applications than DE ones
  10. A data PM!

    1. Plan, design, and deliver data products, owning the product development plan and the overall product delivery with the tech team.
    2. Find business opportunities and deliver the overall data product solution by actively communicating with cross-functional business teams about their requirements and pain points.
    3. Work closely with the product operations team to design the product's commercial model and deliver the product operation plan for key customers.
    4. Strong data sensitivity: able to monitor product performance data, analyze it, and continuously optimize the products for a friendlier, better user experience.
  11. Ok, but ...

    … as a DS I have to know
    - Math
    - Statistics
    - Algorithms
    - Business domains
    - Communication
    - Presentation
  12. Ok, but ...

    … as a DS I have to know
    - Math
    - Statistics
    - Algorithms
    - Business domains
    - Communication
    - Presentation
    That’s why my code could be crappy, right?
  13. Ok, but ...

    … as a DS I have to know
    - Math
    - Statistics
    - Algorithms
    - Business domains
    - Communication
    - Presentation
    That’s why my code could be crappy, right?
    No! Not any more!
  14. DS should also follow common best practices

    • PEP 8, black/autopep8 formatting, pylint, etc.
    • git, CI/CD, GitHub Actions
    • code reviews and proper git flow
    • pre-commit hooks
    • proper documentation
  15. DS should also follow common best practices

    • PEP 8, black/autopep8 formatting, pylint, etc.
    • git, CI/CD, GitHub Actions
    • code reviews and proper git flow
    • pre-commit hooks
    • proper documentation
    • unit tests
    • limit Jupyter notebook usage
    • data version control (tools like DVC)
    • type annotations (mypy)
    • separate venv for each project
  16. Team routines

    Team activities:
    • Learning/Mentoring
    • 1-on-1s
    • (Bi-)weekly demo sessions (cross-project)
    • OKRs
    Cross-team learning:
    • Team routines used for mentoring (e.g. code walkthroughs, mob reviews, refactoring exercises, best practices sharing)
    • Internal company talks
    • Blog posts, meetups/conference participation
  17. Performance reviews

    Competency | Rating (1 to 4, 4 is the best)
    Understanding Engineering Concepts | 1
    Creating Well-crafted Software | 1
    Mastering Technology and Learning | 3
    Handling Complexity and Uncertainty | 2
    Making Constant Progress | 3
    Prioritizing and Owning Tasks | 2
    Communicating Effectively | 3