
GINA_Data_Zoo.pdf


This talk was presented at the CUGOS Fall Fling event on Oct. 16th.

Will Fisher

October 16, 2013


Transcript

  1. The GINA Data Zoo: Puffins, and lynx, and hamsters! Oh my... October 16th, 2013. Will Fisher, Geographic Information Network of Alaska, International Arctic Research Center, University of Alaska Fairbanks. Friday, October 18, 13
  2. How do we... • transfer data? • manage our processing systems? • let people consume data?
  3. We get data and figure out how to make it usable for the people that need it.
  4. Or to put it another way, we feed the animals. Nom nom nom
  5. Source data: ~7.6 times the area of Washington state. Ortho tile data: ~4.3 times the area of Washington state.
  6. Satellite Remote Sensing • Statewide Imagery • High resolution • Historical and time series. Aerial Remote Sensing • LiDAR – elevation • Allows hydro-modeling • Imagery - Historical. Water Sensors • USGS In-stream Data Loggers. In-situ Sensors • MET Sensors
  7. waitp.perl • Based on waitd made by SeaSpace for Terascan • Simple, but... • Serial, no logging, no notifications
  8. The basics • Foreman • Blueprints • Workers • Tools • Packages • :shipit:
  9. Libraries • Celluloid - celluloid.io • Hamster - github.com/harukizaemon/hamster • Listen - github.com/guard/listen • Thor - github.com/erikhuda/thor • StatsD - github.com/etsy/statsd/
  10. Version 0.1 • Native FS events (Polling as a fallback) • Parallel processing • Logging/Notifications • http://github.com/gina-alaska/conveyor
  11. Version 2.0 • Concurrency • Metrics • More robust error handling / job tracking
  12. Down the road • Object store support (S3 compatible) • Message queues to trigger other processing jobs (Redis/0MQ?)
  13. Images • 12 Feeds (5 MODIS, 4 NPP, 2 Radar, 1 Webcam) • ~3TB Archive • ~10GB/day
  14. Yo dawg i herd you like portals, so we put a portal in your portal, so you can search all the portals
  15. What is gLynx? • A single database of data records • But with multiple front ends (portals) • Separate search indexes for each portal • Each portal can manage its own look, data records and users
  16. Why? • Automatic sharing of data records • Shared collection of organizations and contact information • Giving the data owners control
  17. Yo dawg i herd you like portals, so we put a portal in your portal, so you can search all the portals
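
Slides 7 and 10 contrast a serial watcher that has no logging with a tool that detects new files (polling as a fallback), processes them in parallel, and logs what it did. The sketch below illustrates that general pattern only. It is not Conveyor's code: Conveyor builds on the Ruby libraries listed on slide 9, while this example assumes Python's standard library, uses polling alone rather than native filesystem events, and stands in a thread pool for the parallel workers. The names `snapshot`, `changed_files`, `process`, and `poll_once` are hypothetical.

```python
import logging
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("watcher-sketch")


def snapshot(directory):
    """Map each regular file in `directory` to its (mtime_ns, size) signature."""
    state = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                info = entry.stat()
                state[entry.path] = (info.st_mtime_ns, info.st_size)
    return state


def changed_files(previous, current):
    """Return paths that are new or modified since the previous snapshot."""
    return sorted(path for path, sig in current.items() if previous.get(path) != sig)


def process(path):
    """Hypothetical job: report the file size. A real worker would run a processing tool."""
    size = os.path.getsize(path)
    log.info("processed %s (%d bytes)", path, size)
    return path, size


def poll_once(directory, previous, pool):
    """Take one polling pass, run changed files in parallel, return (snapshot, results)."""
    current = snapshot(directory)
    # Submit every changed file before waiting, so the jobs overlap instead of running serially.
    futures = {path: pool.submit(process, path) for path in changed_files(previous, current)}
    results = {}
    for path, future in futures.items():
        try:
            results[path] = future.result()[1]
        except OSError as exc:
            # Log the failure and keep going, so one bad file does not stop the others.
            log.error("failed on %s: %s", path, exc)
    return current, results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as watch_dir, ThreadPoolExecutor(max_workers=4) as pool:
        state = snapshot(watch_dir)
        for name in ("a.tif", "b.tif"):
            with open(os.path.join(watch_dir, name), "wb") as handle:
                handle.write(b"data")
        state, results = poll_once(watch_dir, state, pool)
        print(sorted(os.path.basename(p) for p in results))
```

A long-running watcher would call `poll_once` in a loop with a sleep between passes, carrying the returned snapshot forward so that unchanged files are not processed again.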