
Machine Learning - the only expertise you need

Marketing OGZ
September 20, 2022

Transcript

  1. Modern Data Science practice: Machine Learning - the only expertise you need
    Big Data Expo 2022, Utrecht
    Alexey Chaplygin, Chief Technology & Product Officer @ expondo GmbH
  2. Alexey Chaplygin, Chief Technology & Product Officer @ expondo GmbH
    • Chief Information Officer @ Reface AI
    • Data Science Manager @ PVH Europe
    • Software Developer / Data Science / Machine Learning Engineer @ Booking.com, Vrije Universiteit Amsterdam, ASML, SAP AG and others
  3. Key facts:
    • 120+ million EUR revenue
    • 400+ exponDOers
    • HQ in Berlin
    • Offices in Warsaw, Zielona Góra (PL), Shanghai and Hong Kong
    • Very remote friendly!
    Procurement from 400+ partners in China, Vietnam, India and EU
    Product QA control
    Logistics to own warehouse in Poland
    Own digital production and marketing
    Sales via own web platform and marketplaces
    Own customer care and product aftercare
  4. Data Science vs Machine Learning
    Findings:
    • Neural Networks perform worse on small datasets
    • Neural Networks are smoothing functions
    • Neural Networks are more affected by noisy inputs
    Conclusion:
    Stick to XGBoost and Random Forest
    Find good Machine Learning experts!
  5. Data Science vs Machine Learning
    Neural Networks perform worse on small datasets
    A medium-size business with 10,000,000 EUR revenue and 100 EUR per customer has 100,000 sales points per year. The number of impressions, touch-points and events generated by each customer is 100 times bigger.
    Neural Networks are smoothing functions
    Neural Networks are more affected by noisy inputs
    If you don't know how to cook them and follow only the bookish approach.
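The data-volume estimate on this slide is plain arithmetic and can be checked directly. The figures are the slide's own; the variable names are only illustrative.

```python
# Figures taken from the slide; variable names are illustrative.
annual_revenue_eur = 10_000_000
revenue_per_customer_eur = 100
events_per_sale = 100  # the slide's "100 times bigger"

# One sale per customer at 100 EUR each.
sales_points_per_year = annual_revenue_eur // revenue_per_customer_eur
# Impressions, touch-points and events across all customers.
events_per_year = sales_points_per_year * events_per_sale

print("sales points per year:", sales_points_per_year)
print("events per year:", events_per_year)
```

Under these figures the event log is two orders of magnitude larger than the sales table, which is the slide's argument that a medium-size business does not necessarily have a "small" dataset.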
  6. Practical Experiments
    Experiment #1 – fit the known function: using gradient descent, find coefficients a, b, c, d
    Experiment #2 – find the unknown function: from a random set X consisting of 256 points in [0,1], knowing f(X), find the function g(x) such that g(X) = f(X)
    Experiment #3 – find the unknown space of functions: from random sets X consisting of n random points in [0,1], where n is between 1 and 256, knowing f(X), find coefficients a, b, c, d of the function describing those points
  7. Experiment #1 – fit known
    Experiment #1 – fit the known function: using gradient descent, find coefficients a, b, c, d
    To make it work:
    1. Adam -> RMSProp
    2. BatchSize -> 1
    3. LearningRate -> gradually from 1 to .01
    [Figure: Error Space]
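A minimal sketch of the setup on this slide, in plain Python: RMSProp, batch size 1, and a learning rate decayed from 1 to 0.01. The slides do not state the known function, so a cubic with coefficients a, b, c, d is assumed here, and the geometric shape of the decay, the epoch count and the `rho` value are assumptions as well.

```python
import math
import random

# Assumption: the slides do not give the "known function"; a cubic
# f(x) = a*x^3 + b*x^2 + c*x + d stands in for it here.
TRUE_COEFFS = (1.0, -2.0, 0.5, 3.0)  # hypothetical a, b, c, d


def poly(coeffs, x):
    a, b, c, d = coeffs
    return a * x**3 + b * x**2 + c * x + d


def mse(coeffs, xs, ys):
    return sum((poly(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_rmsprop(xs, ys, epochs=40, lr_start=1.0, lr_end=0.01, rho=0.9, eps=1e-8):
    """Fit a, b, c, d with RMSProp, batch size 1, learning rate 1 -> 0.01."""
    rng = random.Random(0)
    coeffs = [0.0, 0.0, 0.0, 0.0]
    avg_sq = [0.0, 0.0, 0.0, 0.0]  # running mean of squared gradients
    total_steps = epochs * len(xs)
    step = 0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            # Assumed schedule: geometric decay from lr_start to lr_end.
            lr = lr_start * (lr_end / lr_start) ** (step / (total_steps - 1))
            x, y = xs[i], ys[i]
            err = poly(coeffs, x) - y
            # Gradient of the squared error for a single sample (batch size 1).
            grads = [2 * err * x**3, 2 * err * x**2, 2 * err * x, 2 * err]
            for k, g in enumerate(grads):
                avg_sq[k] = rho * avg_sq[k] + (1 - rho) * g * g
                coeffs[k] -= lr * g / (math.sqrt(avg_sq[k]) + eps)
            step += 1
    return coeffs


data_rng = random.Random(42)
xs = [data_rng.random() for _ in range(256)]
ys = [poly(TRUE_COEFFS, x) for x in xs]

initial_mse = mse([0.0] * 4, xs, ys)
fitted = fit_rmsprop(xs, ys)
final_mse = mse(fitted, xs, ys)
print("initial MSE:", initial_mse)
print("final MSE:  ", final_mse)
print("fitted a, b, c, d:", [round(v, 2) for v in fitted])
```

RMSProp divides each gradient by a running estimate of its magnitude, so the step size is governed mostly by the learning rate; that is one plausible reason a decaying schedule matters in this experiment.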
  8. Experiment #2 – fit unknown
    Experiment #2 – find the unknown function: from a random set X consisting of 256 points in [0,1], knowing f(X), find the function g(x) such that g(X) = f(X)
    To make it work:
    1. Adam -> RMSProp
    2. BatchSize -> 1
    3. LearningRate -> .001
    Only interpolation!
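The experiment above can be sketched with a very small network in plain Python, using the slide's settings (RMSProp, batch size 1, learning rate 0.001). Everything else is an assumption made for illustration: the target function standing in for the unknown f, the single hidden layer of 8 tanh units, the initialisation and the number of epochs.

```python
import math
import random


def target(x):
    # Assumption: a stand-in for the unknown f; the slides do not specify it.
    return 1.0 + math.sin(2 * math.pi * x)


def init_params(hidden=8, seed=0):
    rng = random.Random(seed)
    return [
        [rng.uniform(-1, 1) for _ in range(hidden)],      # w1: input -> hidden
        [rng.uniform(-1, 1) for _ in range(hidden)],      # b1: hidden biases
        [rng.uniform(-0.1, 0.1) for _ in range(hidden)],  # w2: hidden -> output
        [0.0],                                            # b2: output bias
    ]


def forward(params, x):
    w1, b1, w2, b2 = params
    hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    return sum(v * h for v, h in zip(w2, hidden)) + b2[0], hidden


def mse(params, xs):
    return sum((forward(params, x)[0] - target(x)) ** 2 for x in xs) / len(xs)


def train(params, xs, epochs=60, lr=0.001, rho=0.9, eps=1e-8, seed=1):
    rng = random.Random(seed)
    avg_sq = [[0.0] * len(group) for group in params]
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:  # batch size 1
            x = xs[i]
            out, hidden = forward(params, x)
            d_out = 2 * (out - target(x))
            # Backpropagate through the single tanh layer.
            d_pre = [d_out * v * (1 - h * h) for v, h in zip(params[2], hidden)]
            grads = [[d * x for d in d_pre], d_pre,
                     [d_out * h for h in hidden], [d_out]]
            # RMSProp update, one running average per parameter.
            for group, cache, grad in zip(params, avg_sq, grads):
                for k, g in enumerate(grad):
                    cache[k] = rho * cache[k] + (1 - rho) * g * g
                    group[k] -= lr * g / (math.sqrt(cache[k]) + eps)
    return params


data_rng = random.Random(42)
train_xs = [data_rng.random() for _ in range(256)]          # inside [0, 1]
outside_xs = [1.0 + data_rng.random() for _ in range(256)]  # outside the training range

params = init_params()
train_mse_before = mse(params, train_xs)
train(params, train_xs)
train_mse_after = mse(params, train_xs)
print("train MSE before:", train_mse_before)
print("train MSE after: ", train_mse_after)
print("MSE on [1, 2]:   ", mse(params, outside_xs))
```

The last line evaluates the fitted network outside the interval it was trained on. Nothing constrains the network there, so the error on [1, 2] should be expected to be much worse than on [0, 1], which is the "Only interpolation!" remark on the slide.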
  9. Experiment #3 – fit them all
    Experiment #3 – find the unknown space of functions: from random sets X consisting of n random points in [0,1], where n is between 1 and 256, knowing f(X), find coefficients a, b, c, d of the function describing those points
    To make it work – classic setup!
    Raw Data -> Feature Engineering -> Extracted Features -> Regression Model
  10. Dynamic Pricing
    The goal: for each product, each sales channel and each country, find a function (price-demand elasticity) that depends on price and [all other data available] and whose output is sales density.
    Inputs:
    • Product Master: vector
    • Product Image: matrix
    • Sales History: sequence of n vectors
    • Marketing: constant
    Classic stack
    • Feature Engineering: 2 FTE, SQL/Python (pandas)
    • Modelling: 1 FTE, Data Science
    • Deployment: 1 FTE, Python Engineering
    • Total: 4 FTE, 3 disciplines
    Machine Learning stack
    • Modelling: 2 FTE, Machine Learning Research
    • Deployment: 1 FTE, Machine Learning Engineering
    • Total: 3 FTE, 1.5 disciplines
  11. Data Science vs Machine Learning
    Data Science: prepare the raw data sources; from each data source, manually extract a vector of features with the same key to join on; build a Data Science model using the features as input; deploy the model.
    Machine Learning: prepare the raw data sources; build a Machine Learning model that automatically extracts feature vectors in its shallow layers and maps them onto the target space in its deep layers; deploy the model.
    Machine Learning is Data Science with automated feature engineering!
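The manual step in the Data Science pipeline described above (one hand-written feature extractor per source, joined on a shared key) can be sketched as follows. The sources, field names and features are hypothetical and serve only to illustrate the shape of the work.

```python
from statistics import mean

# Hypothetical raw sources, both keyed by product id.
product_master = {
    "p1": {"category": "kitchen", "weight_kg": 12.5},
    "p2": {"category": "garden", "weight_kg": 3.0},
}
sales_history = {
    "p1": [4, 6, 5, 9],
    "p2": [1, 0, 2],
}


def master_features(row):
    # Hand-written encoding of one product master record.
    return [1.0 if row["category"] == "kitchen" else 0.0, row["weight_kg"]]


def sales_features(units):
    # Hand-written summary of a variable-length sales sequence.
    return [float(mean(units)), float(max(units)), float(len(units))]


def build_feature_table(master, sales):
    # Join the per-source feature vectors on the shared key.
    return {
        key: master_features(master[key]) + sales_features(sales[key])
        for key in master.keys() & sales.keys()
    }


features = build_feature_table(product_master, sales_history)
for key in sorted(features):
    print(key, features[key])
```

Each new data source needs another extractor of this kind, which is the work the slide says a Machine Learning model takes over by learning the features from the raw inputs.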
  12. Why Machine Learning as a core practice?
    Pros:
    • Narrow stack
    • Shared knowledge, lower bus-factor risk
    • Machine Learning specialists can usually do Data Science, but not the opposite
    • Machine Learning specialists are better coders than Data Scientists
    • Industry invests a lot in GPUs, TPUs, mobile "TensorCores" and other hardware accelerators for Machine Learning
    Cons:
    • Knowledge is scarce, both in management and execution
    • Seniority is required to keep the same speed and quality of development and model interpretability
  13. THANK YOU FOR YOUR TIME!