Upgrade to Pro — share decks privately, control downloads, hide ads and more …

ESASky: New window to the universe

ESASky: New window to the universe

ESA Sky is the science-driven discovery portal developed by engineers and scientists working at the European Space Agency to allow users to explore the multi-wavelength sky and to seamlessly retrieve science-ready data in all ESA Astronomy mission archives from a web application without prior-knowledge of any of the missions. The first public beta of the service has been released, future releases will enable retrieval of spectra and will have special time-domain exploration features.
PostgreSQL is the database system chosen to store the data connecting with other PostgreSQL databases in a performant way. The use of materialized views created on top of foreign tables has helped to get a fast system facilitating the integration with other archives, as well as the use of pgSphere to filter data using polygons and healpix functions. The possibility to integrate more data from other Astronomy centers is a reality which will lead us for new challenges regarding integration, performance, replication and scalability.
We would like to show how ESA Sky works and how we have developed an excellent door to the universe using PostgreSQL.
Check here: http://archives.esac.esa.int/esasky-beta/

Pilar de Teodoro

April 26, 2016
Tweet

More Decks by Pilar de Teodoro

Other Decks in Technology

Transcript

  1. Issue/Revision: 1.0 Reference: PGCon’US 16 Status: Issued ESA UNCLASSIFIED -

    Releasable to the Public ESASky: New Window to the Universe Pilar de Teodoro on behalf of ESDC team 04/19/2016
  2. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 2 ESA UNCLASSIFIED - Releasable to the Public European Space Agency
  3. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 3 ESA UNCLASSIFIED - Releasable to the Public Some physics Different Wavelengths: Image credits: Radio:WSRT/R. Braun; Infrared:NASA/Spitzer/K. Gordon; Visible: Robert Gendler; Ultraviolet: NASA/GALEX; X-ray: ESA/XMM/W. Pietsch
  4. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 4 ESA UNCLASSIFIED - Releasable to the Public ESA Satellites
  5. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 5 ESA UNCLASSIFIED - Releasable to the Public ESASky Concept Goal for Beta release: •  Multi-wavelength exploration •  Single and multi- target search Data Ingredients list: •  All-Sky maps (aka HiPS) •  Individual ESA archives real data and footprints •  Catalogues
  6. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 6 ESA UNCLASSIFIED - Releasable to the Public Ø  Data contents of beta release: •  13 years of INTEGRAL data, 2077 gamma-ray sources •  16 years of XMM-Newton data, 8721 observations, 565962 X-ray and 6 million optical/UV sources •  25 years of HST data, 588820 observations, 29 million optical sources •  4 years of Hipparcos data, 2.5 million optical sources •  2.3 year of ISO data, 47652 observations •  4 years of Herschel data, 16039 observations •  4 years of Planck data, 9 all-sky maps, 153142 radio sources ESA Sky beta In total more than 29.000.000 sources and 1 million observations!!!
  7. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 7 ESA UNCLASSIFIED - Releasable to the Public DEMO or VIDEO DEMO
  8. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 8 ESA UNCLASSIFIED - Releasable to the Public
  9. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 9 ESA UNCLASSIFIED - Releasable to the Public Focusing on data: HiPS HiPS: Hierarchical Progressive Survey (CDS) HEALPix sky tessellation (Hierarchical Equal Area isoLatitude Pixelation) [ http://healpix.sourceforge.net ] HiPS is now IVOA note [ http://ivoa.net/documents/Notes/HiPS/index.html ]
  10. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 10 ESA UNCLASSIFIED - Releasable to the Public HiPS: Hierarchical Progressive Survey HiPS Don’t Lie! CCD borders Duplicated sources Observations with different calibration or exposure time
  11. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 11 ESA UNCLASSIFIED - Releasable to the Public Footprints in numbers Observation type #footprints (#observation) #polygons per observation #total polygons #total points XMM-Newton 8.375 From 2 up to 13 20.041 192.831 XMM-OM(UV) 9.211 2 18.422 38.462 XMM-OM(UVB) 3.379 2 6.758 76.925 HST 345.519 From 2 up to 10 777.376 2.361.085 ISO 6.898 From 2 up to 9 17.104 50.198 Herschel 23.172 2 46.344 1.076.667
  12. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 12 ESA UNCLASSIFIED - Releasable to the Public Catalogues Catalogue #sources INTEGRAL 2.077 XMM Slew 41.423 3XMM EPIC 333.481 XMM OM 3.492.082 Tycho-2 2.539.913 HSC 29.164.309 PSZ 1.653 PGSS 13.242 PCSS 138.247 Effective display of catalogues: order and limit
  13. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 13 ESA UNCLASSIFIED - Releasable to the Public Histograms
  14. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 14 ESA UNCLASSIFIED - Releasable to the Public Yes! We use PostgreSQL!
  15. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 15 ESA UNCLASSIFIED - Releasable to the Public Hardware architecture ESA Sky Science archives at ESAC ESA Hubble Science Archive Herschel Science Archive XMM-Newton Science Archive Frontend Backend Tap server HTTP load balancer DB DATA HiPS ISO Science Archive
  16. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 16 ESA UNCLASSIFIED - Releasable to the Public Frontend Running on a Web Browser (HTML5/CSS3) Google Web Toolkit •  Aladin Lite wrapper (JSNI) •  Data Visualization (Highcharts) Usage of IVOA Protocols •  TAP accessing archive metadata •  STC-s describing complex FoVs Astronomical services access (Simbad) •  Target coordinates resolver •  Angular size resolver
  17. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 17 ESA UNCLASSIFIED - Releasable to the Public Backend Data Access Apache HTTP Server •  Serves HiPS requests Java Servlet container (Tomcat) •  Serves TAP & Target Resolver requests •  Data and metadata download request Database •  PostgreSQL DB (FDW and MV) •  Spherical data types library (PgSphere)+q3c +PostGIS for ephemerides •  Footprints -> Spherical data types Usage of IVOA Protocols & Standards •  TAP requests •  ADQL translation to SQL + PgSphere •  Storage of STC-S footprint information
  18. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 18 ESA UNCLASSIFIED - Releasable to the Public PostgreSQL nodes: Main features Number of nodes: OPS 1 RAM 512GB Storage 2.9TB SSD Operating System RHEL 6 PostgreSQL 9.4.3 Number of nodes: DEV 1 RAM 8GB Storage 100GB SAS Operating System RHEL 6 PostgreSQL 9.4.3
  19. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 19 ESA UNCLASSIFIED - Releasable to the Public PostgreSQL extensions Spherical Indexing SW on PostgreSQL : Q3C 2006, Koposov, S.; Bartunov, O. – Quad Tree pixelization, B-Tree indexing PgSphere • 2008, Bartunov, O.; Sigaev, T.; Richter, J.; Chilingarian, I. – GIST indexing – Support for more spherical objects pg_healpix 2012, Koposov, S. -count strategy
  20. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 20 ESA UNCLASSIFIED - Releasable to the Public Performance Optimization
  21. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 21 ESA UNCLASSIFIED - Releasable to the Public FYI Ø  Expecting 1000 concurrent users navigating and 100 users downloading data Ø  No login Ø  Access quality depends on network Ø  Tested in Australia and very slow Ø  Possibility to replicate HIPS in different countries (CDN)
  22. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 22 ESA UNCLASSIFIED - Releasable to the Public 1-Histograms: count strategy Density maps
  23. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 23 ESA UNCLASSIFIED - Releasable to the Public Count strategy 1- Create the healpix id SELECT healpix_ang2ipix_nest(256::bigint, hst_observation_fdw.ra_deg::double precision, hst_observation_fdw.dec_deg::double precision) AS npix,….. 2- Create the index on the healpix id CREATE INDEX mv_hst_observation_fdw_npix ON public.mv_hst_observation_fdw USING btree (npix); 3- populate the table for querying the counts EXECUTE format('with aux_count as (select npix>>'||shift||' as npix,count(npix) as count from %s group by npix>>'||shift||') update count_table_h'||nside||' u set %s = count from aux_count where u.npix=aux_count.npix', table_to_count, count_column);
  24. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 24 ESA UNCLASSIFIED - Releasable to the Public 2- PostgreSQL tuning How to make faster queries: 1.  Access to different catalogues: •  Postgresql Foreign Data Wrappers •  Materialized views •  Indices on materialized views CREATE MATERIALIZED VIEW public.mv_v_gcc_catalog_fdw AS SELECT v_gcc_catalog_fdw.name, v_gcc_catalog_fdw.ra, v_gcc_catalog_fdw."dec", v_gcc_catalog_fdw.snr, v_gcc_catalog_fdw.gau_major_axis, v_gcc_catalog_fdw.gau_major_axis_sig, v_gcc_catalog_fdw.gau_minor_axis, v_gcc_catalog_fdw.gau_minor_axis_sig, spoint(v_gcc_catalog_fdw.ra * pi() / 180.0::double precision, v_gcc_catalog_fdw."dec" * pi() / 180.0::double precision) AS pos, healpix_ang2ipix_nest(256::bigint, v_gcc_catalog_fdw.ra, v_gcc_catalog_fdw."dec") AS npix FROM v_gcc_catalog_fdw WITH DATA; CREATE INDEX gcc_pos_idx ON public.mv_v_gcc_catalog_fdw USING gist (pos); CREATE INDEX gcc_npix_idx ON public.mv_v_gcc_catalog_fdw USING btree (npix);
  25. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 25 ESA UNCLASSIFIED - Releasable to the Public PostgreSQL: some queries to resolve 1.  Typical queries: [2016-02-19 11:06:56 CET-ammiapp01.n1data.lan(45409)]LOG: duration: 125747.477 ms execute <unnamed>: SELECT COUNT(public.mv_hubble_sc_fdw."pos") AS COUNT FROM public.mv_hubble_sc_fdw WHERE '1' = (public.mv_hubble_sc_fdw."pos" @ spoly('{' || spoint(radians(11.433558137430037),radians(41.78051021988508)) || ',' || spoint(radians(10.685606987908637),radians(41.78291729012794)) || ',' || spoint(radians(9.937655789597498),radians(41.78052176464371)) || ',' || spoint(radians(9.943560916211448),radians(41.267043854985126)) || ',' || spoint(radians(9.949314821366702),radians(40.75356532335342)) || ',' || spoint(radians(10.685592956764461),radians(40.75592346189901)) || ',' || spoint(radians(11.42187104951722),radians(40.75355395876293)) || ',' || spoint(radians(11.427638800815389),radians(41.26703240193894)) || '}')) Limit 100000 [2016-02-19 11:12:18 CET-ammiapp01.n1data.lan(46369)]LOG: duration: 11716.126 ms execute <unnamed>: SELECT COUNT(public.mv_hst_observation_fdw."observation_id") AS COUNT FROM public.mv_hst_observation_fdw WHERE '1' = (public.mv_hst_observation_fdw."fov" && spoly('{' || spoint(radians(11.433558137430037),radians(41.78051021988508)) || ',' || spoint(radians(10.685606987908637),radians(41.78291729012794)) || ',' || spoint(radians(9.937655789597498),radians(41.78052176464371)) || ',' || spoint(radians(9.943560916211448),radians(41.267043854985126)) || ',' || spoint(radians(9.949314821366702),radians(40.75356532335342)) || ',' || spoint(radians(10.685592956764461),radians(40.75592346189901)) || ',' || spoint(radians(11.42187104951722),radians(40.75355395876293)) || ',' || spoint(radians(11.427638800815389),radians(41.26703240193894)) || '}'))
  26. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 26 ESA UNCLASSIFIED - Releasable to the Public 3-Performance projects Ø  pgbench -- run a benchmark test on PostgreSQL Ø  Pgtune
  27. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 27 ESA UNCLASSIFIED - Releasable to the Public PostgreSQL Current nodes: Main features Number of nodes 2 RAM 512G Storage 2.9TB SSD Operating System RHEL 6 PostgreSQL release 9.5.2 Database Size 54G (50TB expected in 5 year with Gaia Data)
  28. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 28 ESA UNCLASSIFIED - Releasable to the Public High Availability and load balancing-I http://www.postgresql.org/docs/current/interactive/high-availability.html Pgpool-II Replication: mmi1 mmi2 RO 9.5.2 RW 9.5.2 Load balancer pgpool-II
  29. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 29 ESA UNCLASSIFIED - Releasable to the Public High Availability and load balancing-II Hot standby: 1-Ability to connect to the server and run read-only queries while the server is in archive recovery or standby mode. 2-If hot_standby is turned on in postgresql.conf and there is a recovery.conf file present, the server will run in Hot Standby mode http://www.postgresql.org/docs/current/interactive/hot-standby.html mmi1 mmi2 RO 9.5.2 RW 9.5.2
  30. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 30 ESA UNCLASSIFIED - Releasable to the Public High Availability and load balancing-III mmi1 mmi2 RW 9.5.2 RW 9.5.2 Ingestion App Server1 App Server2
  31. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 31 ESA UNCLASSIFIED - Releasable to the Public What else can be improved?
  32. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 32 ESA UNCLASSIFIED - Releasable to the Public Release timeline 25th September: Tech-Talk @ ESAC and first internal release to ESAC for feedback 26th October: Focus demo @ ADASS2015 and first public beta release Mid 2016: final release Continuous releases
  33. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 33 ESA UNCLASSIFIED - Releasable to the Public Technology roadmap Beta release @ ADASS2015 • Scientific validation of footprints and ESA all- sky HiPS • Download management • Multi-target summary table • Documentation • Integration with Simbad search engine • Helpdesk Support • Hardware scaling requirements First release (Mid 2016) • Interoperability with VO tools • SAMP integration • Refactoring of some components • DB High availability • DB load balancing • Video tutorials Continuous releases • Link to Vizier • Generation of detailed footprints (spectra) • Imaging and spectroscopic data • Online visualization of data • Time-series • Observation planning • Massive data visualization • Virtual Reality • Solar objects using PostGIS
  34. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 34 ESA UNCLASSIFIED - Releasable to the Public Data integration roadmap Beta release @ ADASS2015 • All-sky HiPS mosaics • INTEGRAL • XMM-Newton • HST • ISO • AKARI • Herschel • Planck • Science ready data (imaging) • XMM-Newton • XMM OM(UV and UVB) • HST(core) • Herschel • ISOCAM • Catalogs • INTEGRAL • 3XMM EPIC • XMM Slew • XMM OM • Tycho-2 • Hubble • Planck First release (Mid 2016) • All-sky HiPS mosaics • EXOSAT • Science ready data (imaging) • EXOSAT • INTEGRAL • XMM-Newton • HST • Herschel • Suzaku(JAXA) • Catalogs • Hipparcos • AKARI • Hubble • Planck • 3XMM EPIC • XMM Slew • XMM OM Continuous releases • All-sky HiPS mosaics • Future HiPS • Science ready data (imaging and spectra) • EXOSAT • INTEGRAL • XMM-Newton • HST • ISOCAM • Herschel • Future missions • Catalogs • 3XMM EPIC • XMM Slew • XMM OM • Hubble • Planck • Gaia • Herschel • Future catalogues
  35. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 35 ESA UNCLASSIFIED - Releasable to the Public Conclusion ESA Sky http://archives.esac.esa.int/esasky-beta •  Visualization web tool of all ESA (and others) space science data from one interface •  Built on top of the existing ESA science archives •  Access to final best data products from ESA space science missions, through VO protocols Using PostgreSQL
  36. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 36 ESA UNCLASSIFIED - Releasable to the Public So we will continue…. All that heaven will allow
  37. Pilar de Teodoro| PGCon’US 16 | Brooklyn | 19/04/2016 |

    Slide 37 ESA UNCLASSIFIED - Releasable to the Public Questions? Team Fabrizio Giordano (Software Engineer and Key Person), Jesús Salgado (Technical astro lead), Bruno Merín (Product Owner), Deborah Baines, Belén López Martí (HiPS generation), María H. Sarmiento (Software Engineer), Elena Racero (HiPS and footprints), Raúl Gutiérrez (Software Engineer), Pilar de Teodoro (DB administrator), Sara Nieto (Sotware Engineer) Previous Collaborators (MMI prototype) Iñaki Ortiz, Ignacio León, Andy Pollock, Michael Rosa, Javier Castellanos Acknowledge Pierre Fernique and Thomas Boch from CDS, expert science and technical staff at ESAC Your feedback is really appreciated! https://support.cosmos.esa.int/esdc/ Or directly to: [email protected]