Sharded Light-curve Database

Bart Scheers
LOFAR Transients Key Project Meeting, Meudon, December 2011

transientskp

June 23, 2012

Transcript

  1. Sharded Light-curve Database
    Bart Scheers
    Astronomical Institute Anton Pannekoek, University of Amsterdam
    Centre for Mathematics & Informatics (CWI)
    LOFAR TKP Meeting, Meudon, 2011-12-14
  2. LOFAR Catalogue (of Light Curves)
    ▸ List of all sources detected at least once by LOFAR
    ▸ Multiple observations per source
      ▹ Light curves
      ▹ Adding time domain → dynamic catalogue
    ▸ Keep track of 'meta-data'
      ▹ image properties (noise)
      ▹ observation characteristics
    ▸ Make available for data mining/discovery
      ▹ Scalable
    LOFAR TKP Meeting – 2011-12-14, Bart Scheers
  3. What do we expect?
    ▸ Full operation: 50 – 100 TB/yr
    ▸ Peaks: 10,000 sources per second
    ▸ Distinct sources: ~10⁷ – 10⁸
      ▹ which are revisited many, many, many times
    ▸ These numbers call for bulk-processing
      ▹ maintain statistical representations of data
      ▹ spread data over multiple nodes
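
Spreading the data over multiple nodes can be illustrated with a minimal sharding sketch. The slides do not state the sharding key, so the declination-zone scheme, the `shard_for` name, the one-degree zone height and the node count used below are all assumptions made for illustration.

```python
import math


def shard_for(decl_deg: float, n_shards: int, zone_height_deg: float = 1.0) -> int:
    """Map a source declination (degrees) to a shard index (hypothetical scheme)."""
    if not -90.0 <= decl_deg <= 90.0:
        raise ValueError("declination must lie in [-90, 90] degrees")
    if n_shards < 1:
        raise ValueError("n_shards must be positive")
    # Declination zone: 0 at the south pole, increasing northwards.
    zone = int(math.floor((decl_deg + 90.0) / zone_height_deg))
    # Deal zones out round-robin so neighbouring zones land on different nodes.
    return zone % n_shards


if __name__ == "__main__":
    # Two measurements of the same source fall in the same zone, hence the same shard.
    print(shard_for(52.90, 144), shard_for(52.95, 144))
```

Because the key depends only on position, all points of one light curve stay on one node, provided the key is computed from the associated catalogue position rather than from each noisy measurement.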
  4. Source Association
    Dynamic (updated after every image)
    Static (but updated after every db instantiation)
  5. TKP Data(base) flow
    Light-curve Database, Long-term Archive, 50 – 100 TB/yr
    TRAP Database, during observations, ≲ 500 GB
  6. Source Association
    (1) distance on sky
    (2) dimensionless distance
    (3) likelihood ratio
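
The first two association measures can be sketched as follows. This is an illustration under stated assumptions, not the TKP implementation: the function names are hypothetical, the position errors are taken to be Gaussian 1-sigma on-sky angles in degrees, and the dimensionless distance is written in the commonly used form where each offset is divided by the quadrature sum of the two errors. The likelihood ratio is not covered.

```python
import math


def angular_separation_deg(ra1, decl1, ra2, decl2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, decl1, ra2, decl2 = map(math.radians, (ra1, decl1, ra2, decl2))
    sin_ddecl = math.sin((decl2 - decl1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddecl ** 2 + math.cos(decl1) * math.cos(decl2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def dimensionless_distance(ra1, decl1, ra_err1, decl_err1,
                           ra2, decl2, ra_err2, decl_err2):
    """Positional offset in units of the combined 1-sigma errors (assumed form)."""
    # Wrap the right-ascension difference into [-180, 180) degrees.
    d_ra = (ra2 - ra1 + 180.0) % 360.0 - 180.0
    # Project the RA offset onto the sky at the mean declination.
    d_ra *= math.cos(math.radians(0.5 * (decl1 + decl2)))
    d_decl = decl2 - decl1
    return math.sqrt(d_ra ** 2 / (ra_err1 ** 2 + ra_err2 ** 2)
                     + d_decl ** 2 / (decl_err1 ** 2 + decl_err2 ** 2))
```

The dimensionless form is the more useful cut for a catalogue with widely varying positional accuracy, since a fixed angular radius is too strict for faint sources and too loose for bright ones.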
  8. Source Variability
    (1) absolute flux change
    (2) weighted flux change
    Maintain 6 properties
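
The transcript does not list the six maintained properties. One plausible set, assumed here, is the number of measurements plus five running averages (of I, I², w, wI and wI², with w = 1/σ²). From these an unweighted variability measure (sample standard deviation over mean flux) and a weighted one (reduced chi-square about the weighted mean) can be updated after every image without re-reading the light curve. The class and method names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningFlux:
    """Running aggregates for one source (an assumed set of six properties)."""
    n: int = 0               # number of measurements
    avg_i: float = 0.0       # mean flux
    avg_i_sq: float = 0.0    # mean squared flux
    avg_w: float = 0.0       # mean weight, w = 1 / sigma^2
    avg_w_i: float = 0.0     # mean of w * flux
    avg_w_i_sq: float = 0.0  # mean of w * flux^2

    def add(self, flux: float, flux_err: float) -> None:
        """Fold one measurement into the averages without storing it."""
        w = 1.0 / flux_err ** 2
        n_new = self.n + 1

        def fold(avg: float, value: float) -> float:
            return (self.n * avg + value) / n_new

        self.avg_i = fold(self.avg_i, flux)
        self.avg_i_sq = fold(self.avg_i_sq, flux ** 2)
        self.avg_w = fold(self.avg_w, w)
        self.avg_w_i = fold(self.avg_w_i, w * flux)
        self.avg_w_i_sq = fold(self.avg_w_i_sq, w * flux ** 2)
        self.n = n_new

    def v(self) -> float:
        """Unweighted measure: sample standard deviation over the mean flux."""
        if self.n < 2:
            raise ValueError("need at least two measurements")
        var = self.n / (self.n - 1) * (self.avg_i_sq - self.avg_i ** 2)
        return math.sqrt(max(var, 0.0)) / self.avg_i

    def eta(self) -> float:
        """Weighted measure: reduced chi-square about the weighted mean flux."""
        if self.n < 2:
            raise ValueError("need at least two measurements")
        return self.n / (self.n - 1) * (
            self.avg_w_i_sq - self.avg_w_i ** 2 / self.avg_w)
```

Storing averages rather than raw points is what lets the association step touch one row per source per image, at the cost of some numerical cancellation for very long, nearly constant light curves.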
  9. Association & Variability Probabilities
    ▸ Rayleigh distribution
    ▸ η_ν behaves as a chi-square probability
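
For the association side, a short sketch of how a Rayleigh distribution turns into a cut-off radius. It assumes, as is standard for Gaussian position errors but not spelled out in the transcript, that the dimensionless distance of genuine counterparts follows a unit-scale Rayleigh distribution. The function names are hypothetical.

```python
import math


def rayleigh_tail(r: float) -> float:
    """P(R > r) for a unit-scale Rayleigh distribution."""
    if r < 0.0:
        raise ValueError("r must be non-negative")
    return math.exp(-0.5 * r * r)


def radius_for_missed_fraction(p: float) -> float:
    """Dimensionless radius beyond which a fraction p of true matches falls."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return math.sqrt(-2.0 * math.log(p))


if __name__ == "__main__":
    # Radius that loses one genuine counterpart in a thousand.
    print(round(radius_for_missed_fraction(1e-3), 3))
```

The chi-square tail probability for the variability index depends on the number of measurements and needs an incomplete gamma function, which is best taken from a statistics library rather than written by hand.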
  10. Global Sky Model
    ▸ Get all VLSS sources within the field of view
    ▸ Find (none or) counterpart in WENSS and NVSS catalogues
    ▸ Fit spectral index, curvature and higher-order curvature terms
    ▸ Create source-list file
    ▸ Wanted: No VLSS in FoV, use WENSS as base
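
The fitting step can be sketched for the simplest case, a single spectral index. This is not the `gsm` module's code: it is an unweighted least-squares straight line in log-log space, with the convention S ∝ ν^α and the function name both chosen here as assumptions. Curvature terms would add higher powers of log ν and are not shown.

```python
import math


def fit_spectral_index(freqs_hz, fluxes_jy):
    """Fit log10(S) = log10(S0) + alpha * log10(nu / nu0) by least squares.

    Returns (alpha, s0), where nu0 is the first frequency in freqs_hz.
    """
    if len(freqs_hz) != len(fluxes_jy) or len(freqs_hz) < 2:
        raise ValueError("need at least two matching frequency/flux pairs")
    nu0 = freqs_hz[0]
    x = [math.log10(nu / nu0) for nu in freqs_hz]
    y = [math.log10(s) for s in fluxes_jy]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    alpha = sxy / sxx
    return alpha, 10.0 ** (y_mean - alpha * x_mean)


if __name__ == "__main__":
    # Survey frequencies: VLSS 74 MHz, WENSS 325 MHz, NVSS 1400 MHz.
    freqs = [74e6, 325e6, 1400e6]
    fluxes = [10.0 * (f / 74e6) ** -0.7 for f in freqs]  # synthetic source
    print(fit_spectral_index(freqs, fluxes))
```

With only a VLSS and one other counterpart the fit reduces to the two-point spectral index; with no counterpart at all, no index can be fitted and a default would have to be assumed.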
  11. Global Sky Model
    gsm.expected_fluxes_in_fov(conn, ra_c, decl_c, fov_radius, assoc_theta, 'bbs.skymodel.test', storespectraplots=True)
  12. We want to mine more...
    ▸ Detecting trends
      ▹ n sequential data points mσ above average
    ▸ Systematic structure of light curve
      ▹ Ratio of the mean square successive difference to the sample variance
    ▸ FTs, cross- & auto-correlations; all work with (varying) window sizes → SciQL
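
The first two statistics on this slide are simple enough to sketch in plain Python. The function names are hypothetical, and taking the average and σ from the light curve itself (rather than from a separate quiescent baseline) is an assumption.

```python
import statistics


def has_run_above(fluxes, n, m):
    """True if at least n consecutive points exceed mean + m * sigma."""
    threshold = statistics.mean(fluxes) + m * statistics.stdev(fluxes)
    run = 0
    for flux in fluxes:
        # Extend the current run, or reset it when a point drops below threshold.
        run = run + 1 if flux > threshold else 0
        if run >= n:
            return True
    return False


def von_neumann_ratio(fluxes):
    """Mean square successive difference divided by the sample variance."""
    k = len(fluxes)
    if k < 2:
        raise ValueError("need at least two data points")
    mssd = sum((b - a) ** 2 for a, b in zip(fluxes, fluxes[1:])) / (k - 1)
    return mssd / statistics.variance(fluxes)
```

For uncorrelated noise the ratio is close to 2; a smooth drift pulls it towards 0 and rapid alternation pushes it above 2, which is why it flags systematic structure that a plain variance test misses.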
  13. SciQL – Cross-Correlation Example
    ▸ Extend SQL2003 → SciQL
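
The SciQL query shown on this slide is not preserved in the transcript. To make the computed quantity concrete, here is a plain Python sketch (not SciQL) of a normalised cross-correlation between two equally sampled light curves at a given lag. The normalisation and the function name are assumptions.

```python
import math


def cross_correlation(x, y, lag):
    """Normalised cross-correlation of equal-length series at an integer lag.

    Pairs x[i] with y[i + lag] over the samples where both exist.
    """
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    norm = math.sqrt(sum((a - mean_x) ** 2 for a in x)
                     * sum((b - mean_y) ** 2 for b in y))
    if norm == 0.0:
        raise ValueError("constant series have no defined correlation")
    overlap = range(max(0, -lag), min(n, n - lag))
    total = sum((x[i] - mean_x) * (y[i + lag] - mean_y) for i in overlap)
    return total / norm


if __name__ == "__main__":
    x = [0, 0, 1, 0, 0, 0]
    y = [0, 0, 0, 1, 0, 0]  # same pulse, one sample later
    print(max(range(-2, 3), key=lambda k: cross_correlation(x, y, k)))
```

A windowed variant would apply the same function to slices of the two series, which is the kind of array-oriented operation the slide proposes to express directly in the query language.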
  14. SciLens Platform (original)
    ▸ Computational top tier (1 node)
    ▸ High-end tier (16 nodes)
    ▸ Cloud-oriented tier (64 nodes)
    ▸ Energy-conservative tier (256 nodes)
  15. SciLens Platform (current)
    ▸ Top and high-end tier not built yet
    ▸ Cloud-oriented tier
      ▹ 144 Rocks, single quad cores, 16 GB, 0.5 or 1 TB SSD, 1 × 2 TB, InfiniBand (40 Gb/s)
    ▸ Bottom tier
      ▹ 144 Pebbles, AMD Bobcat, 8 GB, 5 × 2 TB, Ethernet (1 Gb/s)
  19. Conclusions
    ▸ Statistical representation of full LOFAR catalogue relaxes source association
    ▸ Sharded database reduces replication
    ▸ Together with SciLens the infrastructure is scalable
    ▸ SciQL extends data mining opportunities