Machine Learning on Production

Eko Kurniawan Khannedy

March 18, 2016

Transcript

  1. MACHINE LEARNING ON PRODUCTION EKO KURNIAWAN KHANNEDY ▸ Principal Software Development Engineer at blibli.com ▸ Part of the Research and Development Team ▸ [email protected]
  2. THE HARDEST PART IS BRINGING MACHINE LEARNING TO PRODUCTION …
  3. AGENDA ▸ The Hard Part ▸ Best Practice ▸ Machine Learning in blibli.com
  4. DATA ▸ Data Too Big ▸ Unstructured Data ▸ Document-Oriented and Master-Detail Data ▸ Continuous Data ▸ Imbalanced Data ▸ Wild Data
  5. PREPROCESSING ▸ Feature Extraction ▸ Too Many Feature Extractions Make the Process Too Long
  6. DATA TOO BIG ▸ Load data into memory. ▸ Stream the data source. ▸ Split the data across multiple nodes. ▸ Use a memory-file database.
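
As a minimal sketch of the "streaming the data source" bullet above: the rows are read one at a time and a statistic is updated incrementally, so the full dataset never has to fit in memory. The CSV layout, the column name `amount`, and the helper names `stream_rows` and `running_mean` are assumptions for illustration, not taken from the deck.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, TextIO


def stream_rows(source: TextIO) -> Iterator[Dict[str, str]]:
    """Yield one CSV row at a time instead of reading the whole source."""
    yield from csv.DictReader(source)


def running_mean(rows: Iterable[Dict[str, str]], column: str) -> float:
    """Compute the mean of a column in a single pass with constant memory."""
    count = 0
    mean = 0.0
    for row in rows:
        count += 1
        # Incremental update: move the mean toward the new value.
        mean += (float(row[column]) - mean) / count
    return mean


# Hypothetical in-memory data; in practice `source` would be an open file
# or a network stream that is too large to load at once.
sample = io.StringIO("order_id,amount\n1,100\n2,250\n3,400\n")
average_amount = running_mean(stream_rows(sample), "amount")
```

The same generator can feed any single-pass consumer, which is what makes this approach combine well with splitting the data across nodes.
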
  7. UNSTRUCTURED DATA ▸ Analyse Your Data ▸ Find the Characteristics of Your Data ▸ Find the Best Approach for That Case
  8. DOCUMENT-ORIENTED AND MASTER-DETAIL DATA ▸ Analyse Your Data ▸ Find the Best Way to Treat the Data
  9. CONTINUOUS DATA ▸ Widen the range used in the normalization process. ▸ Treat a value outside that range as a missing value.
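
One possible reading of the two bullets above, sketched under assumptions: min-max normalization is done over a range that is wider than the range observed in training, and a value that still falls outside the widened range is reported as missing (`None`). The function name `normalize` and the `margin` parameter are hypothetical, and the sketch assumes `hi > lo`.

```python
from typing import Optional


def normalize(value: float, lo: float, hi: float,
              margin: float = 0.25) -> Optional[float]:
    """Min-max scale `value` into [0, 1] over a widened range.

    The observed range [lo, hi] is widened on each side by `margin`
    times its width. A value outside the widened range returns None,
    meaning "handle as a missing value".
    """
    width = hi - lo
    wide_lo = lo - margin * width
    wide_hi = hi + margin * width
    if not wide_lo <= value <= wide_hi:
        return None
    return (value - wide_lo) / (wide_hi - wide_lo)


inside = normalize(50, lo=0, hi=100)        # within the observed range
near_edge = normalize(110, lo=0, hi=100)    # outside observed, inside widened
far_outside = normalize(200, lo=0, hi=100)  # outside the widened range
```

Widening the range keeps moderately unusual production values usable, while extreme values are passed to whatever missing-value strategy is in place.
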
  10. WILD DATA ▸ Use a Default Value. ▸ Use the Average Value. ▸ Use Machine Learning to Predict the Missing Value.
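
The first two strategies above can be sketched as follows, assuming missing entries are represented as `None`. The helper names are hypothetical; the third strategy (predicting the missing value with a model) depends on the dataset and is not shown.

```python
from statistics import mean
from typing import List, Optional


def fill_with_default(values: List[Optional[float]],
                      default: float) -> List[float]:
    """Replace every missing entry with a fixed default value."""
    return [default if v is None else v for v in values]


def fill_with_average(values: List[Optional[float]]) -> List[float]:
    """Replace every missing entry with the mean of the known entries."""
    known = [v for v in values if v is not None]
    if not known:
        raise ValueError("cannot average a column with no known values")
    average = mean(known)
    return [average if v is None else v for v in values]


column = [1.0, None, 3.0]
with_default = fill_with_default(column, default=0.0)
with_average = fill_with_average(column)
```
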
  11. FEATURE EXTRACTION ▸ Add as Many Facts as Possible ▸ Remove Irrelevant Features
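
The deck does not say how irrelevant features are identified. As one simple heuristic, swapped in here purely for illustration, a feature that has the same value in every record carries no information and can be dropped. The function name and the sample records are hypothetical.

```python
from typing import Dict, List


def drop_constant_features(
        rows: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Remove features whose value is identical across all rows.

    Assumes every row has the same keys as the first row.
    """
    if not rows:
        return []
    keep = [name for name in rows[0]
            if len({row[name] for row in rows}) > 1]
    return [{name: row[name] for name in keep} for row in rows]


records = [
    {"amount": 100.0, "currency_code": 1.0},
    {"amount": 250.0, "currency_code": 1.0},
]
reduced = drop_constant_features(records)
```
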
  12. TOO MANY FEATURE EXTRACTIONS MAKE THE PROCESS TOO LONG ▸ Use a Non-Blocking Process ▸ Use an Event-Driven Process ▸ Use a Parallel Process
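
A minimal sketch of the "parallel process" idea above, assuming each feature extractor is an independent function of the input record. All extractors are submitted to a thread pool at once, so slow extractors (for example, calls to external services) overlap instead of running one after another. The extractor names and the record layout are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict

Record = Dict[str, Any]
Extractor = Callable[[Record], Any]


def extract_features(record: Record,
                     extractors: Dict[str, Extractor],
                     max_workers: int = 4) -> Dict[str, Any]:
    """Run every extractor on the record concurrently and collect results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything first so the extractors run in parallel.
        futures = {name: pool.submit(fn, record)
                   for name, fn in extractors.items()}
        # result() blocks until that extractor finishes and re-raises
        # any exception it threw.
        return {name: future.result() for name, future in futures.items()}


# Hypothetical extractors for an order record.
extractors = {
    "item_count": lambda r: len(r["items"]),
    "total_amount": lambda r: sum(i["price"] * i["qty"] for i in r["items"]),
}
order = {"items": [{"price": 100, "qty": 2}, {"price": 50, "qty": 1}]}
features = extract_features(order, extractors)
```

Threads suit I/O-bound extractors; for CPU-bound extraction in Python, a process pool would be the closer fit.
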
  13. FRAUD PREVENTION PLATFORM (diagram components) ▸ RESTful ▸ Master Data ▸ Client ▸ Machine Learning Engine ▸ Preprocessing Engine ▸ Third Party Service
  14. MACHINE LEARNING ENGINE (diagram components) ▸ RESTful ▸ Metadata Data ▸ Client ▸ Training Engine ▸ Training Data ▸ Classification Engine