Build Data Application with Dagster

Weera Kasetsin (Ball)
LINE Thailand Common Engineering Office Head
https://linedevday.linecorp.com/jp/2019/sessions/S1-04

LINE DevDay 2019

November 20, 2019

Transcript

  1. 2019 DevDay > Build Data Application With Dagster > Weera Kasetsin (Ball) > LINE Thailand Common Engineering Office Head
  2. I spent 20% of my time building my web/app, and 80% of my time fighting the browser.
  3. I spent 80% of my time cleaning the data, and 20% of my time doing my job.
  4. Data Cleaning
    • Rolling your own custom infrastructure
    • Maintaining unreliable processes built atop untested software
    • Doing repetitive work that should not be necessary
    • And much more…
  5. What is a data application? A data application is a graph of functional computations that consume and produce data assets.
  6. def split_cereals(context, cereals):
         # Emit the "hot_cereals" output only when enabled in the solid config.
         if context.solid_config["process_hot"]:
             hot_cereals = DataFrame(
                 [cereal for cereal in cereals if cereal["type"] == "H"]
             )
             yield Output(hot_cereals, "hot_cereals")
         # Emit the "cold_cereals" output only when enabled in the solid config.
         if context.solid_config["process_cold"]:
             cold_cereals = DataFrame(
                 [cereal for cereal in cereals if cereal["type"] == "C"]
             )
             yield Output(cold_cereals, "cold_cereals")
  7. @pipeline(
         mode_defs=[
             # 'unittest' mode: the warehouse resource is a local SQLite database.
             ModeDefinition(
                 name='unittest',
                 resource_defs={'warehouse': local_sqlite_warehouse_resource},
             ),
             # 'dev' mode: the same resource key points at Postgres instead.
             ModeDefinition(
                 name='dev',
                 resource_defs={
                     'warehouse': sqlalchemy_postgres_warehouse_resource
                 },
             ),
         ]
     )
     def modes_pipeline():
         normalize_calories(read_csv())
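
The ideas in slides 5 to 7 can be sketched without Dagster at all. The following is a minimal plain-Python illustration, not the Dagster API: a step that yields named outputs only when its config enables them, and a pipeline runner that picks a "warehouse" resource by mode. The names `InMemoryWarehouse`, `LoggingWarehouse`, `MODES`, and `run_pipeline`, as well as the sample rows, are hypothetical and exist only for this example.

```python
from typing import Callable, Dict, Iterator, List, Tuple

Cereal = Dict[str, str]


def split_cereals(
    config: Dict[str, bool], cereals: List[Cereal]
) -> Iterator[Tuple[str, List[Cereal]]]:
    """Yield (output_name, rows) pairs, skipping outputs that are switched off.

    `config` is a plain dict standing in for Dagster's solid config (assumption).
    """
    if config.get("process_hot", True):
        yield "hot_cereals", [c for c in cereals if c["type"] == "H"]
    if config.get("process_cold", True):
        yield "cold_cereals", [c for c in cereals if c["type"] == "C"]


class InMemoryWarehouse:
    """Hypothetical stand-in for a local test warehouse resource."""

    def __init__(self) -> None:
        self.tables: Dict[str, List[Cereal]] = {}

    def write(self, table: str, rows: List[Cereal]) -> None:
        self.tables[table] = list(rows)


class LoggingWarehouse(InMemoryWarehouse):
    """Hypothetical stand-in for a 'dev' warehouse; it only adds logging."""

    def write(self, table: str, rows: List[Cereal]) -> None:
        print(f"writing {len(rows)} rows to {table}")
        super().write(table, rows)


# Each mode binds the same resource key to a different implementation.
MODES: Dict[str, Dict[str, Callable[[], InMemoryWarehouse]]] = {
    "unittest": {"warehouse": InMemoryWarehouse},
    "dev": {"warehouse": LoggingWarehouse},
}


def run_pipeline(
    mode: str, config: Dict[str, bool], cereals: List[Cereal]
) -> InMemoryWarehouse:
    """Run the same steps against the resources of the chosen mode."""
    warehouse = MODES[mode]["warehouse"]()
    for output_name, rows in split_cereals(config, cereals):
        warehouse.write(output_name, rows)
    return warehouse


# Made-up sample rows, for illustration only.
cereals = [
    {"name": "Example Hot", "type": "H"},
    {"name": "Example Cold A", "type": "C"},
    {"name": "Example Cold B", "type": "C"},
]
warehouse = run_pipeline(
    "unittest", {"process_hot": True, "process_cold": True}, cereals
)
```

The pipeline body never names a concrete warehouse. Switching from `"unittest"` to `"dev"` changes only which resource is constructed, which is the separation that the mode definitions in slide 7 express.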