apidays Paris 2024 - Build and scale an AI-prod...

apidays
December 22, 2024

apidays Paris 2024 - Build and scale an AI-product in a SaaS solution, Fanny Guélin and Arthur Delaitre, Mirakl

Build and scale an AI-product in a SaaS solution
Fanny Guélin, Senior Product Manager at Mirakl
Arthur Delaitre, Data Science Manager - AI Catalog at Mirakl

apidays Paris 2024 - The Future API Stack for Mass Innovation
December 3 - 5, 2024

------

Check out our conferences at https://www.apidays.global/

Do you want to sponsor or talk at one of our conferences?
https://apidays.typeform.com/to/ILJeAaV8

Learn more on APIscene, the global media made by the community for the community:
https://www.apiscene.io

Explore the API ecosystem with the API Landscape:
https://apilandscape.apiscene.io/

Transcript

  1. Build and scale an AI-product in a SaaS solution

    Confidential Mirakl 2024. Fanny Guélin, Senior Product Manager, Catalog management & seller onboarding. Arthur Delaitre, Manager Data Science, AI Catalog specialist.
  2. Mirakl: an all-in-one SaaS solution to manage marketplace platforms

    Created in 2012, 700+ employees. World leader of SaaS marketplace solutions. +340 clients (marketplaces), +300k sellers.
  3. GenAI use case. Catalog onboarding: a seller's nightmare

    Seller catalog 😭
  4. GenAI use case. Median time to create an offer: 28 days*

    User pain point + business opportunity 💸
    *On a Mirakl marketplace in 2023
  5. GenAI use case. Mapping “AI-assisted”

    Vendor fields mapped to catalog fields, with AI assistance (🤖):
    Vendor description → Short description: Clarks 7.5 cm heel, ivory elegant heeled sandal that stands up to the demands of the day - premium sand uppers with knotted design details complement the foot perfectly.
    Vendor brand (Clarks) → Brand (Clarks)
    Vendor color (Grey) → Colour (Light Grey)
    Vendor material (Leather) → Composition
    Heel height
  6. GenAI use case. Optimizing features with AI

    Median time to onboard a catalog: 15 days in 2020 (Mapping, no AI); 11 days in 2023 (Mapping AI-assisted 🤖), -30%.
    Areas for improvement:
    • Product data completion
    • User experience
    • Mapping duration in absolute value
  7. GenAI use case. Product import “AI-powered”
  8. GenAI use case. 1 feature: 4 GenAI applications

    (1) Transformation & Extraction: transformation of the seller’s catalog format to the operator’s, maximizing the completion of information.
    (2) Enrichment: adding external data to the seller’s source data.
    (3) Translation: automated translation into the operator’s target languages.
    (4) Rewrite: rewriting titles & descriptions, following the operator’s data requirements & optimizing for SEO.

    Seller's source data:
    Vendor description: Clarks 7.5 cm heel, ivory elegant heeled sandal that stands up to the demands of the day - premium sand uppers with knotted design details complement the foot perfectly.
    Vendor material: Leather

    Output fields:
    Short description: Clarks 7.5 cm heel, ivory elegant heeled sandal that stands up to the demands of the day - premium sand uppers with knotted design details complement the foot perfectly.
    Brand: Clarks
    Colour: Creamy
    Heel height: 3 inches
    Composition: 100% natural leather
    Collection: Fall Winter 2023
    Short description: Premium sand leather uppers with knotted design details complement the foot perfectly
    Short description FR: Clarks talon de 7,5 cm, sandale à talon élégante ivoire qui résiste aux exigences du jour - dessus sable haut de gamme avec détails de conception noués complètent parfaitement le pied.
  9. GenAI use case. Reshaping features with AI

    Median time to onboard a catalog: 15 days in 2020 (Mapping, no AI); 11 days in 2023 (Mapping AI-assisted 🤖); 24h in 2024 (Product import powered by AI, Mirakl Catalog Transformer ✨), -91%.
  10. From user needs to solution

    When potential meets technical challenges.
  11. Challenge #1: Architecture

    Scale, control costs, and maintain modularity.
  12. Using LLMs for catalog onboarding

    Does it scale? Costs? Volume?
  13. Challenge #1: Architecture. What LLMs can we use regarding costs?

    Volume hypothesis: 10M products created each month, i.e. ~70B input tokens and ~25B output tokens.
    Costs depending on the model: ~$0.5M/month with GPT4; ~$15k/month with an 8B model.
    For target volumes: small & medium LLMs.
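The slide's cost estimate can be reproduced with simple arithmetic. The sketch below takes the token volumes from the slide (10M products, ~70B input tokens, ~25B output tokens per month); the per-million-token prices are hypothetical values picked only to land on the slide's orders of magnitude (~$0.5M and ~$15k per month), not actual vendor pricing.

```python
from dataclasses import dataclass

# Monthly volumes taken from the slide.
PRODUCTS_PER_MONTH = 10e6
INPUT_TOKENS = 70e9
OUTPUT_TOKENS = 25e9


@dataclass(frozen=True)
class Pricing:
    """Hypothetical USD price per million tokens (illustrative, not real pricing)."""
    input_per_million: float
    output_per_million: float


def monthly_cost(pricing: Pricing,
                 input_tokens: float = INPUT_TOKENS,
                 output_tokens: float = OUTPUT_TOKENS) -> float:
    """Monthly cost in USD for the given token volumes."""
    return ((input_tokens / 1e6) * pricing.input_per_million
            + (output_tokens / 1e6) * pricing.output_per_million)


# Assumed price points, chosen to match the slide's totals.
LARGE_MODEL = Pricing(input_per_million=5.00, output_per_million=6.00)
SMALL_MODEL = Pricing(input_per_million=0.15, output_per_million=0.18)

print(f"tokens per product: {INPUT_TOKENS / PRODUCTS_PER_MONTH:,.0f} in, "
      f"{OUTPUT_TOKENS / PRODUCTS_PER_MONTH:,.0f} out")
print(f"large model: ${monthly_cost(LARGE_MODEL):,.0f}/month")
print(f"small model: ${monthly_cost(SMALL_MODEL):,.0f}/month")
```

Under these assumptions each product averages about 7,000 input and 2,500 output tokens, and the gap between the two price points is what drives the choice of small and medium models at target volumes.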
  14. Challenge #1: Architecture. First version to test customer fit in a Beta version

    Seller catalog → Model 1 (GPT4*, 100% of volume) → User interface
    ⛔ 500k€/month at scale
    🚨 Performance, latency, hallucinations, consistency, customization
    * Example, models can change
  15. Challenge #1: Architecture. Second version with fine-tuned LLMs for General Availability

    Seller catalog → Model 1* (100% of volume) → User interface
    🚨 Need a training data set close to real data; need ground truth; harder to set up
    * Example, models can change
  16. Challenge #1: Architecture. Fine-tuned LLM: data issue outside of distribution

    Smaller models perform less effectively on "exotic" cases (i.e. far from what is included in the training set).
    Goal: use the smallest model that delivers acceptable performance for a given input.
    Layering: Seller catalog → Router → Model 1, Model 2, Model 3
    Models*: ChatGPT, LLaMa 3.1 70B, Mistral NeMo
    Volume split: >85% of volume, ~10% of volume, >5% of volume
    * Example, models can change
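The routing goal (use the smallest model that performs acceptably for a given input) could look roughly like the sketch below. The talk does not describe how the router decides, so everything here is an assumption: the tier names, the thresholds, and the toy `familiarity` score (share of input tokens seen in training data) that stands in for a real in-distribution signal.

```python
from typing import List, Set, Tuple

# Hypothetical tiers, ordered from smallest/cheapest to largest.
# A tier handles inputs whose familiarity score is at least its threshold.
TIERS: List[Tuple[str, float]] = [
    ("small-finetuned", 0.8),
    ("medium-finetuned", 0.5),
    ("large-general", 0.0),  # catch-all for "exotic" inputs
]


def familiarity(text: str, known_vocab: Set[str]) -> float:
    """Toy in-distribution score: fraction of tokens present in the training vocabulary."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(token in known_vocab for token in tokens) / len(tokens)


def route(text: str, known_vocab: Set[str],
          tiers: List[Tuple[str, float]] = TIERS) -> str:
    """Return the name of the first (smallest) tier whose threshold the input meets."""
    score = familiarity(text, known_vocab)
    for name, threshold in tiers:
        if score >= threshold:
            return name
    return tiers[-1][0]


vocab = {"ivory", "leather", "sandal", "heel"}
print(route("ivory leather sandal", vocab))
print(route("ivory sandal xylophone widget", vocab))
print(route("quantum flux capacitor", vocab))
```

With a signal like this, most familiar inputs stay on the cheapest tier and only inputs far from the training set escalate to larger models.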
  17. Challenge #1: Architecture. Step back: other design requirements

    Sequential operation of multiple models: need for an adaptable architecture
    Large seller files: enable horizontal parallelization
    Cost optimization: shared GPU inference & LLM across specific endpoints
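Horizontal parallelization of a large seller file amounts to splitting the catalog into chunks and processing them independently. The sketch below uses Python's standard thread pool as a stand-in for a distributed engine; `transform_chunk` is a hypothetical placeholder for a call to a model-serving endpoint, and the chunk size and worker count are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterator, List

Row = Dict[str, object]


def chunks(rows: List[Row], size: int) -> Iterator[List[Row]]:
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def transform_chunk(rows: List[Row]) -> List[Row]:
    """Placeholder for a model call: here it only normalizes the title."""
    return [{**row, "title": str(row["title"]).strip().title()} for row in rows]


def transform_catalog(rows: List[Row], chunk_size: int = 1000,
                      workers: int = 4) -> List[Row]:
    """Process chunks in parallel; Executor.map keeps the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(transform_chunk, chunks(rows, chunk_size))
        return [row for chunk in results for row in chunk]


catalog = [{"sku": i, "title": f" item {i} "} for i in range(2500)]
transformed = transform_catalog(catalog)
print(len(transformed), transformed[0])
```

Because each chunk is independent, the same structure carries over to a cluster engine where chunks become partitions.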
  18. Challenge #1: Architecture. Focus: target architecture in services

    Seller catalog → Mirakl Catalog Transformer → Model 1, Model 2, Model N
    Components: model registry, GPU serving endpoint, LLM hosting endpoint
    Heavy computations behind APIs (public APIs, private APIs)
    Parallelized in Spark (for large catalogs)
  19. Challenge #2: Performance

    Build a training dataset and evaluate outputs.
  20. Challenge #2: Performance. How to fine-tune an LLM on a specific task?

    Training: product data (input) + expected response
    Offline inference with the fine-tuned LLM: product data → predictions
    Use a bigger model for labeling? But how to ensure quality?
  21. Challenge #2: Performance. Ensure quality and coherence (e.g. limit hallucinations)

    1st step: prompt engineering, LLM as a judge
    2nd step: Galileo
    3rd step: human in the loop
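One way to picture the judge step: labels produced by a bigger model are scored, and only well-scored examples enter the training set, while the rest go to human review. The sketch below is an assumption-laden illustration, not the speakers' setup: `grounded_judge` is a toy stand-in for an LLM judge (it checks that each labeled value appears verbatim in the source text), and the threshold is arbitrary.

```python
from typing import Any, Callable, Dict, List, Tuple

Example = Dict[str, Any]  # {"input": str, "label": Dict[str, str]}


def grounded_judge(source: str, label: Dict[str, str]) -> float:
    """Toy judge: fraction of labeled values found verbatim in the source text."""
    if not label:
        return 0.0
    text = source.lower()
    hits = sum(value.lower() in text for value in label.values())
    return hits / len(label)


def curate(examples: List[Example],
           judge: Callable[[str, Dict[str, str]], float] = grounded_judge,
           min_score: float = 1.0) -> Tuple[List[Example], List[Example]]:
    """Split examples into (kept for training, sent to human review)."""
    kept: List[Example] = []
    review: List[Example] = []
    for example in examples:
        score = judge(example["input"], example["label"])
        (kept if score >= min_score else review).append(example)
    return kept, review


source = "Clarks ivory heeled sandal, leather upper"
examples = [
    {"input": source, "label": {"brand": "Clarks", "colour": "ivory"}},
    {"input": source, "label": {"brand": "Clarks", "colour": "blue"}},  # ungrounded colour
]
kept, review = curate(examples)
print(len(kept), "kept,", len(review), "for review")
```

Swapping `grounded_judge` for a call to a judge model keeps the same curation loop; the review list is where a human in the loop would step in.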
  22. Challenge #2: Performance. Training pipeline

    Prompt creation → Pseudo ground truth → Dataset curation → Training → Evaluation → Inference
  23. Key learnings

    LLMs are inherently prone to hallucinations.
    Significant effort is required to move from a POC to a product that scales.
    Plan for costs early in the project and have an actionable plan in place.
    Anticipate the rapid development of LLM offerings: stay agile and plan for decreasing costs.
    The use of LLMs is a "game-changer" for the user experience.