Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Transient Detection and SQL

transientskp
January 09, 2014

Transient Detection and SQL

Gijs Molenaar

transientskp

January 09, 2014
Tweet

More Decks by transientskp

Other Decks in Science

Transcript

  1. Images 10 bands (image cube) • 1 image cube per

    second • In future 10 images per second • 20 to 50 sources per image
  2. Data • Source (unique id) • Position (ra & decl)

    • Error (ra & decl) • Flux • Shape (2d gaussian) • Timestamp • Frequency (band)
  3. The data Unique id, ra, decl, errors, flux, shape, timestamp,

    frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency table
  4. Quite a lot of data • So quite a lot

    of calculations • move calculations to data (base) • Minimal communication
  5. Problems with SQL • Mix of language domains (python, SQL)

    • No SQL standard (various dialects) • Manual: • type checking / conversion • Error handling • schema migrations / instantiation
  6. Problems with monolithic queries • hard to read • hard

    to reason about • Hard to atomize • hard to test • Code reuse difficult • hard to maintain
  7. NO! • DB revisions, versioning hell • For every code

    change new version • Schema migrations • Init functions on startup? • DB specific
  8. Why? • Object Relational Mapper • Support multiple database dialect

    • Client side error checking (before query execution) • Less code • Better Query Builder Language SQLAlchemy -> SQL assembler -> C
  9. But also • • 3rd party applications! • Schema management

    • Creation and Up/down grading • Tools for creating migration paths
  10. Why not earlier? • Unforeseen problem at project start •

    No MonetDB dialect for SQLAlchemy • (I made one)
  11. Conclusion • SQLAlchemy is SQL powertool • I see no

    reason NOT to use SQLAlchemy • Will result in improved code readability and maintainability • (if properly used of course)