Science Data Management @ NOAO

LCOGT February 25, 2014

Joshua Hoblitt

February 25, 2014

Transcript

  1. SDM's role

    Essentially, we “store, cook, and serve” “the data”.

    – Data “hand off” / Mountain top caches
    – Data transport
    – Data storage / Archiving (indexing - limited)
    – Data reduction (not all instruments)
    – Portal access

    Via the Portal(s), we also serve as part of the public interface between NOAO and the community at large.
  2. Scale

    • 5 sites
      – KP, TU, CT, CP, LS
    • ~10 telescopes*
    • “3 copy” rule
      – Applies to raw data only
      – Effectively have 4 copies
    • 3 “mass store systems”
      – 1 x 524T (metadata & data replication)
      – 2 x 175T
    • 3 pipelines
      – deccp, newfirm, mosaic
    • 2 public archives
      – raw/reduced
      – survey
  3. FITS compression

    NOAO Science Data Report for 2014-02-23, R. Seaman

    replication     files    fz(MB)   R(avg)   uncompressed
    NORTH             351    4843.5     2.95     14306.8 MB
    SOUTH            1379  133411.5     2.40    320406.5 MB
    total            1730  138255.0     2.42    334713.3 MB

    data type       files    fz(MB)   R(avg)   uncompressed
    16-bits          1168  134222.2     2.40    322222.5 MB
    32-bits           259    3590.6     3.14     11274.8 MB
    FP (raw)          303     442.2     2.75      1216.0 MB

    observing mode  files    fz(MB)   R(avg)   uncompressed
    classical         990  135199.5     2.43    328023.4 MB
    queue             740    3055.5     2.19      6689.9 MB
    eng                 -         -        -              -
    split               -         -        -              -
    too                 -         -        -              -
    survey              -         -        -              -
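The R(avg) column in the report above appears to be the uncompressed size divided by the compressed (fz) size for each row. That reading is an assumption, not something the slide states, but the published figures agree with it to two decimal places. A minimal sketch that recomputes the replication rows (the `rows` dictionary and the `ratio` helper are illustrative names, not part of any NOAO tool):

```python
# Figures copied from the "replication" rows of the report:
# name -> (file count, compressed fz size in MB, uncompressed size in MB)
rows = {
    "NORTH": (351, 4843.5, 14306.8),
    "SOUTH": (1379, 133411.5, 320406.5),
}


def ratio(fz_mb: float, raw_mb: float) -> float:
    """Compression ratio, assumed here to be uncompressed / compressed."""
    return raw_mb / fz_mb


# Per-site ratios
ratios = {name: ratio(fz, raw) for name, (_, fz, raw) in rows.items()}

# Totals are plain sums of the per-site rows
total_files = sum(files for files, _, _ in rows.values())
total_fz = sum(fz for _, fz, _ in rows.values())
total_raw = sum(raw for _, _, raw in rows.values())
total_ratio = ratio(total_fz, total_raw)

for name, r in ratios.items():
    print(f"{name:6s} R = {r:.2f}")
print(f"total  R = {total_ratio:.2f} over {total_files} files")
```

Note that the total ratio is the ratio of the summed sizes, so it is dominated by the much larger SOUTH volume rather than being a simple mean of the two site ratios.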
  4. Checksums

    • Network
      – 1's complement is insensitive to byte swapping
      – FITS uses a 1's complement checksum
      – TCP uses a 1's complement checksum
      – Most Layer 2 transports use a more robust CRC
        • But these are recomputed for every L2 segment
    • Storage
      – Generally no end-to-end checksumming*
      – “silent corruption” / non-repeatable reads
      – Array rebuild errors
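The weakness of a 1's complement checksum noted above can be shown with a short sketch. It uses a 16-bit sum in the style of the TCP checksum (the FITS checksum uses 32-bit words, which this example does not reproduce), and compares it with CRC-32 from Python's standard `zlib` module. The function names and the sample payload are illustrative assumptions.

```python
import zlib


def ones_complement_sum16(data: bytes) -> int:
    """16-bit 1's complement sum over big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return total


def internet_checksum(data: bytes) -> int:
    """Checksum is the bitwise complement of the 1's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


def swap_bytes_in_words(data: bytes) -> bytes:
    """Swap the two bytes of every 16-bit word (even-length input)."""
    out = bytearray(data)
    out[0::2], out[1::2] = data[1::2], data[0::2]
    return bytes(out)


original = b"\x12\x34\xab\xcd\x00\x01\xff\xfe"

# Case 1: two 16-bit words exchange places. The data is different,
# but addition is commutative, so the checksum cannot tell.
reordered = original[2:4] + original[0:2] + original[4:]
same_checksum = internet_checksum(original) == internet_checksum(reordered)
crc_differs = zlib.crc32(original) != zlib.crc32(reordered)

# Case 2: every word is byte-swapped. The sum comes out byte-swapped
# too, so the check still passes when computed in the swapped order.
swapped = swap_bytes_in_words(original)
s = ones_complement_sum16(original)
sum_is_byte_swapped = ones_complement_sum16(swapped) == ((s & 0xFF) << 8 | s >> 8)

print("reordered words, same 1's complement checksum:", same_checksum)
print("reordered words, CRC-32 differs:", crc_differs)
print("byte-swapped data gives byte-swapped sum:", sum_is_byte_swapped)
```

In this sketch the reordered payload passes the 1's complement check but fails the CRC-32 comparison, which is the sense in which a CRC is the more robust of the two.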