Transient Detection and SQL

Ab44292d7d6f032baf342a98230a6654?s=47 transientskp
January 09, 2014

Transient Detection and SQL

Gijs Molenaar

Ab44292d7d6f032baf342a98230a6654?s=128

transientskp

January 09, 2014
Tweet

Transcript

  1. 1.

    Transient Detection and SQL LOFAR Transients Key Science Project meeting

    January 9 2014 Gijs Molenaar @gijzelaerr gijs@pythonic.nl
  2. 3.

    Images 10 bands (image cube) • 1 image cube per

    second • In future 10 images per second • 20 to 50 sources per image
  3. 4.

    Data • Source (unique id) • Position (ra & decl)

    • Error (ra & decl) • Flux • Shape (2d gaussian) • Timestamp • Frequency (band)
  4. 8.

    The data Unique id, ra, decl, errors, flux, shape, timestamp,

    frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency table
  5. 9.
  6. 10.
  7. 11.
  8. 12.
  9. 13.
  10. 14.
  11. 15.

    Quite a lot of data • So quite a lot

    of calculations • move calculations to data (base) • Minimal communication
  12. 16.

    Problems with SQL • Mix of language domains (python, SQL)

    • No SQL standard (various dialects) • Manual: • type checking / conversion • Error handling • schema migrations / instantiation
  13. 17.

    Problems with monolithic queries • hard to read • hard

    to reason about • Hard to atomize • hard to test • Code reuse difficult • hard to maintain
  14. 18.
  15. 20.

    NO! • DB revisions, versioning hell • For every code

    change new version • Schema migrations • Init functions on startup? • DB specific
  16. 21.
  17. 22.

    Why? • Object Relational Mapper • Support multiple database dialect

    • Client side error checking (before query execution) • Less code • Better Query Builder Language SQLAlchemy -> SQL assembler -> C
  18. 23.

    But also • • 3rd party applications! • Schema management

    • Creation and Up/down grading • Tools for creating migration paths
  19. 25.
  20. 26.
  21. 27.
  22. 28.

    Why not earlier? • Unforeseen problem at project start •

    No MonetDB dialect for SQLAlchemy • (I made one)
  23. 30.
  24. 31.

    Conclusion • SQLAlchemy is SQL powertool • I see no

    reason NOT to use SQLAlchemy • Will result in improved code readability and maintainability • (if properly used of course)