Slide 1

Slide 1 text

Transient Detection and SQL LOFAR Transients Key Science Project meeting January 9 2014 Gijs Molenaar @gijzelaerr [email protected]

Slide 2

Slide 2 text

TRAP • TRAnsient Detection Pipeline • Detect Transients in AARTFAAC data • 24/7 full sky monitor

Slide 3

Slide 3 text

Images 10 bands (image cube) • 1 image cube per second • In future 10 images per second • 20 to 50 sources per image

Slide 4

Slide 4 text

Data • Source (unique id) • Position (ra & decl) • Error (ra & decl) • Flux • Shape (2d gaussian) • Timestamp • Frequency (band)

Slide 5

Slide 5 text

Data rate • 500 - 5000 inserts per second • May get more in future

Slide 6

Slide 6 text

Calculations • Associate in frequency • Associate in time • Transients detection

Slide 7

Slide 7 text

SQL Storing, manipulating and retrieving data De facto standard Structured data Relational data Old language

Slide 8

Slide 8 text

The data Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency Unique id, ra, decl, errors, flux, shape, timestamp, frequency table

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

Quite a lot of data • So quite a lot of calculations • move calculations to data (base) • Minimal communication

Slide 16

Slide 16 text

Problems with SQL ● Mix of language domains (python, SQL) ● No SQL standard (various dialects) ● Manual: ● type checking / conversion ● Error handling ● schema migrations / instantiation

Slide 17

Slide 17 text

Problems with monolithic queries • hard to read • hard to reason about • Hard to atomize • hard to test • Code reuse difficult • hard to maintain

Slide 18

Slide 18 text

Example

Slide 19

Slide 19 text

Solution? • Server side functions?

Slide 20

Slide 20 text

NO! • DB revisions, versioning hell • For every code change new version • Schema migrations • Init functions on startup? • DB specific

Slide 21

Slide 21 text

SOLUTION

Slide 22

Slide 22 text

Why? • Object Relational Mapper • Support multiple database dialect • Client side error checking (before query execution) • Less code • Better Query Builder Language SQLAlchemy -> SQL assembler -> C

Slide 23

Slide 23 text

But also ● ● 3rd party applications! ● Schema management ● Creation and Up/down grading ● Tools for creating migration paths

Slide 24

Slide 24 text

And last but not least ● Backwards compatible! ● Can still do raw queries

Slide 25

Slide 25 text

example

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

Why not earlier? • Unforeseen problem at project start • No MonetDB dialect for SQLAlchemy • (I made one)

Slide 29

Slide 29 text

sqlalchemy-monetdb • pip install sqlalchemy-monetdb • https://github.com/gijzelaerr/sqlalchemy-monetdb • https://github.com/gijzelaerr/djonet • (banana)

Slide 30

Slide 30 text

Possible Disadvantage A bit harder to see the actual SQL BUT SQLALchemy very similar to SQL

Slide 31

Slide 31 text

Conclusion ● SQLAlchemy is SQL powertool ● I see no reason NOT to use SQLAlchemy ● Will result in improved code readability and maintainability ● (if properly used of course)

Slide 32

Slide 32 text

Questions?