◮ Build light-curve catalogue
◮ Enable fast processing and access (exploit the database engine)

The Schema Design
◮ Propagate algorithms to the data
◮ Optimise for comparison of the latest measurements with a statistical model of all measurements
◮ Recently: redesign, renaming, explicit table relations, installing & upgrading

Bart Scheers | TKP Meeting | 2012-12-04 LOFAR Databases

The Content
◮ External catalogues: VLSS(r), WENSS, NVSS, exoplanets
◮ Standard frequency bands (as defined for MSSS)
◮ Original measurements
◮ Deduced data: associations between measurements, cataloguing measurements
◮ Meta-data: pipeline configuration and task settings

Raw data ∼ 25 TB/hr
Here, we focus on the database
◮ Distinct sources: ∼ $10^7$ to $10^8$,
  ⊲ which are measured/revisited many, many, many times
◮ Single measurement stores ∼ 300 B of data
◮ Overall data accumulation about 50 to 100 TB/yr
◮ Peaks may be over 10,000 source measurements per second
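
The figures above can be cross-checked with a little arithmetic. The sketch below assumes decimal terabytes (1 TB = $10^{12}$ B) and a 365.25-day year; the function names are hypothetical and only illustrate how the yearly volume relates to a sustained measurement rate.

```python
# Back-of-the-envelope check of the data volumes quoted on the slide.
# Assumptions (not from the slide): 1 TB = 1e12 bytes, 1 year = 365.25 days.
BYTES_PER_MEASUREMENT = 300           # "Single measurement stores ~300 B"
BYTES_PER_TB = 1e12
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def mean_measurements_per_second(tb_per_year: float) -> float:
    """Sustained measurement rate implied by a yearly data volume."""
    measurements_per_year = tb_per_year * BYTES_PER_TB / BYTES_PER_MEASUREMENT
    return measurements_per_year / SECONDS_PER_YEAR


def ingest_rate_mb_per_second(measurements_per_second: float) -> float:
    """Ingest bandwidth (MB/s) needed for a given measurement rate."""
    return measurements_per_second * BYTES_PER_MEASUREMENT / 1e6


if __name__ == "__main__":
    for volume in (50, 100):
        rate = mean_measurements_per_second(volume)
        print(f"{volume} TB/yr -> about {rate:,.0f} measurements/s on average")
    print(f"10,000 measurements/s -> {ingest_rate_mb_per_second(10_000):.1f} MB/s")
```

Under these assumptions, 50 to 100 TB/yr corresponds to a sustained rate of roughly 5,300 to 10,600 measurements per second, the same order of magnitude as the quoted peak rate.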
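
◮ We want to summarise/reduce our data statistically, instead of using all individual data points
◮ Therefore, we use a more database-friendly approach

Average: $\overline{x}_N = \frac{1}{N}\sum_{i=1}^{N} x_i \;\Rightarrow\; \overline{x}_{N+1} = \frac{N\,\overline{x}_N + x_{N+1}}{N+1}$

Weighted average: $\xi_N = \frac{\sum_{i=1}^{N} w_i x_i}{\sum_{i=1}^{N} w_i} \;\Rightarrow\; \xi_{N+1} = \frac{N\,\overline{w}_N\,\xi_N + w_{N+1}x_{N+1}}{N\,\overline{w}_N + w_{N+1}}, \quad w_{N+1} = 1/e_{N+1}^2$

Here $\overline{w}_N = \frac{1}{N}\sum_{i=1}^{N} w_i$ is the running average of the weights, so that $N\,\overline{w}_N\,\xi_N = \sum_{i=1}^{N} w_i x_i$.

Variability indices per band:

Magnitude: $V_\nu = \frac{s_\nu}{\overline{I_\nu}} = \frac{1}{\overline{I_\nu}}\sqrt{\frac{N}{N-1}\left(\overline{I_\nu^2} - \overline{I_\nu}^2\right)}$

Significance: $\eta_\nu = \frac{N}{N-1}\left(\overline{w_\nu I_\nu^2} - \frac{\overline{w_\nu I_\nu}^2}{\overline{w_\nu}}\right)$

◮ Store factors for fast calculation
◮ http://docs.transientskp.org/tkp/database/schema.html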
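
The incremental updates above can be sketched in a few lines of Python. This is a minimal illustration, not the TraP implementation: the class `RunningStats` and its field names are hypothetical, and it assumes the stored "factors" are the running averages of $I$, $I^2$, $w$, $wI$ and $wI^2$ with $w = 1/e^2$.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running averages for one source in one band (hypothetical helper)."""
    n: int = 0
    avg_i: float = 0.0       # running average of I
    avg_i_sq: float = 0.0    # running average of I^2
    avg_w: float = 0.0       # running average of w
    avg_w_i: float = 0.0     # running average of w * I
    avg_w_i_sq: float = 0.0  # running average of w * I^2

    def update(self, flux: float, flux_err: float) -> None:
        """Fold one new measurement into the stored averages."""
        w = 1.0 / flux_err ** 2
        n = self.n

        def step(avg: float, value: float) -> float:
            # avg_{N+1} = (N * avg_N + value) / (N + 1)
            return (n * avg + value) / (n + 1)

        self.avg_i = step(self.avg_i, flux)
        self.avg_i_sq = step(self.avg_i_sq, flux * flux)
        self.avg_w = step(self.avg_w, w)
        self.avg_w_i = step(self.avg_w_i, w * flux)
        self.avg_w_i_sq = step(self.avg_w_i_sq, w * flux * flux)
        self.n = n + 1

    def weighted_mean(self) -> float:
        """Weighted average xi_N, from two stored averages."""
        return self.avg_w_i / self.avg_w

    def v(self) -> float:
        """Variability magnitude V (needs at least two measurements)."""
        n = self.n
        variance = n / (n - 1) * (self.avg_i_sq - self.avg_i ** 2)
        return math.sqrt(max(variance, 0.0)) / self.avg_i

    def eta(self) -> float:
        """Variability significance eta (needs at least two measurements)."""
        n = self.n
        return n / (n - 1) * (self.avg_w_i_sq - self.avg_w_i ** 2 / self.avg_w)


if __name__ == "__main__":
    stats = RunningStats()
    for flux, err in [(1.0, 0.1), (2.0, 0.1), (3.0, 0.1)]:
        stats.update(flux, err)
    print(stats.weighted_mean(), stats.v(), stats.eta())
```

Each new measurement touches only a handful of stored numbers, so the indices can be refreshed without rereading the full light curve. In a database the same update would presumably be a single `UPDATE` on the row holding these averages.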

on position
◮ User-defined sources
◮ Picked up by the TraP
◮ Forced fits at locations by sourcefinder
◮ RMS upper limits if no source is found

light curves
◮ Use Variability Magnitude ($V_\nu$) and Significance ($\eta_\nu$) indices
◮ Reduced $\chi^2$ probability justifies rejection or acceptance of $H_0$ (i.e. the source not being variable)
  ⊲ $p_{\eta_\nu} = \int_{\eta'_\nu = \eta_\nu}^{\infty} p(\eta'_\nu, N-1)\,\mathrm{d}\eta'_\nu$
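
The tail integral above can be evaluated numerically. The sketch below assumes that $\eta_\nu$ is a reduced $\chi^2$ statistic with $N-1$ degrees of freedom, so that the probability equals the regularised upper incomplete gamma function $Q\left(\frac{N-1}{2}, \frac{(N-1)\,\eta_\nu}{2}\right)$. The function names are hypothetical, and the incomplete gamma routine is a standard series/continued-fraction evaluation written with the standard library only. A small probability argues for rejecting $H_0$.

```python
import math


def upper_incomplete_gamma_reg(a: float, x: float,
                               eps: float = 1e-12, max_iter: int = 10_000) -> float:
    """Regularised upper incomplete gamma function Q(a, x), for a > 0."""
    if x <= 0.0:
        return 1.0
    log_prefactor = -x + a * math.log(x) - math.lgamma(a)
    if x < a + 1.0:
        # Series for the lower function P(a, x); then Q = 1 - P.
        denom = a
        term = 1.0 / a
        total = term
        for _ in range(max_iter):
            denom += 1.0
            term *= x / denom
            total += term
            if abs(term) < abs(total) * eps:
                break
        return 1.0 - total * math.exp(log_prefactor)
    # Continued fraction for Q(a, x), evaluated with the modified Lentz method.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(log_prefactor)


def variability_probability(eta: float, n: int) -> float:
    """Probability of a reduced chi-square >= eta under H0, for n measurements."""
    if n < 2:
        raise ValueError("need at least two measurements")
    dof = n - 1
    # Reduced chi-square eta corresponds to chi-square value eta * dof.
    return upper_incomplete_gamma_reg(dof / 2.0, eta * dof / 2.0)


if __name__ == "__main__":
    for eta in (1.0, 3.0, 10.0):
        print(eta, variability_probability(eta, n=10))
```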

◮ Duration
◮ Peak flux
◮ Absolute and relative increase and decrease from background to peak flux, and the increase/decrease ratio

Load and Alter SQL Statements
[Figure: time in seconds (logarithmic axis, $10^{-1}$ to $10^{4}$) for four cases: Load on single node, Alter on single node, Load over 9 nodes, Alter over 9 nodes]
Load data; alter table: add and update 4 DBL columns
T1: 4.5 GB, row size 1023 B, 4 Mrows
T2: 85 GB, row size 467 B, 165 Mrows

[Figure: time in seconds (logarithmic axis, $10^{-3}$ to $10^{2}$) for queries Q3, Q4, Q5 and Q6, in four cases: Cold Q on single node, Cold Q over 9 nodes, Hot Q on single node, Hot Q over 9 nodes]
Cold mode: after server start, no in-memory data
Hot mode: in-memory data