Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Transient detection with SQLAlchemy

Transient detection with SQLAlchemy

About how we use SQLAlchemy as a domain specific language to build maintainable scientific SQL queries.

Gijs Molenaar

May 09, 2014
Tweet

More Decks by Gijs Molenaar

Other Decks in Science

Transcript

  1. Me • Gijs Molenaar • Scientific Software Engineer • UvA

    - Anton Pannekoek Instituut • SKA - Square Kilometre Array (South Africa) ! • http://pythonic.nl
  2. Data rate • Data cubes • Multiple frequencies • 500

    - 5000 sources /s • Will grow in future
  3. SQL • Storing, manipulating and retrieving data • De facto

    standard • Structured data • Relational data • Old language
  4. insert into id, ra, decl, errors, flux, shape, timestamp, frequency

    id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency id, ra, decl, errors, flux, shape, timestamp, frequency Table
  5. A lot of data • So quite a lot of

    calculations • move calculations to data • Minimize communication
  6. SQL problems • Mix of language domains (python, SQL) •

    No SQL standard (various dialects) • Manual: • type checking / conversion • Error handling • schema migrations / instantiation
  7. monolithic queries • hard to read • hard to reason

    about • Hard to atomise • hard to test • Code reuse difficult • hard to maintain
  8. no! • DB revisions, versioning hell • For every code

    change new version • Schema migrations • DB specific
  9. Why? • Object Relational Mapper • Support multiple database dialect

    • Better Query Builder Language! • SQLAlchemy -> SQL C -> assembler
  10. but also • 3rd party applications • Schema management •

    profiling • Backwards compatible, raw queries
  11. Conclusions • SQLAlchemy is SQL power tool • No reason

    NOT to use SQLAlchemy • improved code readability and maintainability • (if properly used of course)