FOSS4GNA | MapD for Analysts: Visualizing Your Geospatial Data on GPUs

There has been an explosion of geospatial data sources and collection in recent years, and with it the need for technologies that data analysts can use to query and visualize very large data sets interactively, exploring and acting on the data in real time. Using the open source MapD SQL engine and MapD analytics platform, you can eliminate waiting for slow reports, downsampling, and pre-aggregating, and put the power of the GPU to use in rendering your complex geospatial charts and plots.

In this workshop, participants will be given access to a MapD instance and hands-on instruction in the following:

How to install and set up the MapD platform in the cloud or on-premises.
How to import a geospatial dataset, either one of our samples or one of your own.
How to create dashboards with geospatial charts in MapD Immerse.
How to create multi-layer maps, combining data from multiple sources.
How to update dashboards and share them.

Want to experience for yourself the power of GPUs to visualize very large geospatial datasets with instantaneous interaction? This is the workshop for you.

OmniSci

May 17, 2018

Transcript

  1. © MapD 2018 Veda Shankar, Sr Developer Advocate, MapD Community

    [email protected]
    slides: https://speakerdeck.com/mapd
  2. Categories of common MapD use cases

    Operational Analytics
    • Thwart Banking Fraud
    • Scan for Cyber Threats
    • Fine-tune Advertising
    • Maintain the Utility Grid
    Geospatial Analytics
    • Monitor Networks
    • Ready Logistics
    • Forecast Micro-weather
    Data Science
    • Model Financial Markets
    • Predict Maintenance
    • Predict Staffing Levels
  3. Advanced memory management

    Three-tier caching to GPU RAM for speed and to SSDs for persistent storage

    Tier                        Capacity        Bandwidth          Data   Speedup over cold data
    GPU RAM (L1)                24GB to 256GB   1000-6000 GB/sec   Hot    1500x to 5000x
    CPU RAM (L2)                32GB to 3TB     70-120 GB/sec      Warm   35x to 120x
    SSD or NVRAM STORAGE (L3)   250GB to 20TB   1-2 GB/sec         Cold   -

    Diagram labels: COMPUTE LAYER, STORAGE LAYER, Data Lake/Data Warehouse/System Of Record
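To make the bandwidth figures on this slide concrete, the sketch below estimates how long a single pass over a data set would take at each tier. The bandwidth ranges are the ones quoted on the slide; the 100 GB data set size and the idealized "size divided by bandwidth" model are assumptions for illustration only, and real query times depend on much more than raw bandwidth.

```python
# Bandwidth ranges in GB/sec, as quoted on the slide.
TIERS = {
    "GPU RAM (L1)": (1000, 6000),
    "CPU RAM (L2)": (70, 120),
    "SSD or NVRAM (L3)": (1, 2),
}


def scan_seconds(size_gb, bandwidth_gb_per_sec):
    """Idealized time to stream size_gb once at the given bandwidth."""
    return size_gb / bandwidth_gb_per_sec


def scan_time_range(size_gb, tier):
    """Return (best_case, worst_case) scan time in seconds for a tier."""
    low_bw, high_bw = TIERS[tier]
    return scan_seconds(size_gb, high_bw), scan_seconds(size_gb, low_bw)


DATASET_GB = 100  # hypothetical working-set size, not taken from the slide

for tier in TIERS:
    best, worst = scan_time_range(DATASET_GB, tier)
    print(f"{tier}: {best:.3f} s to {worst:.3f} s per full scan")
```

Under these assumptions a full scan takes 50 to 100 seconds from SSD but at most a tenth of a second from GPU RAM, which is the intuition behind keeping hot data in the top tier.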
  4. The GPU Open Analytics Initiative (GOAI)

    Seamless data interchange framework in GPU memory
  5. The GPU Open Analytics Initiative (GOAI)

    Creating common data frameworks to accelerate data science on GPUs
    • /mapd/pymapd
    • /gpuopenanalytics/pygdf
  6. Machine Learning Pipeline

    Personas in Analytics Lifecycle (Illustrative)
    Personas: Business Analyst, Data Scientist, Data Engineer, IT Systems Admin, Data Scientist / Business Analyst
    Stages: Data Preparation, Data Discovery & Feature Engineering, Model & Validate, Predict, Operationalize, Monitoring & Refinement, Evaluate & Decide
    Diagram label: GPUs
  7. Github ML Examples

    • We’ve published a few notebooks showing how to connect to a MapD database and use an ML algorithm to make predictions
    • /gpuopenanalytics/demo-docker
    • /mapd/mapd-ml-demo
  8. Tutorial: MapD Fundamentals

    Exploring Google Analytics with MapD
  9. Tutorial - MapD Fundamentals

    Agenda
    • Launch MapD Cloud Instance
    • Run a Python program to download website analytics data
    • Upload the data to MapD
    • Create a dashboard with different charts in MapD Immerse
    • Learn about the MapD Crossfilter feature

    Software Prerequisite
    • pip install --upgrade google-api-python-client
    • pip install matplotlib pandas python-gflags
    • pip install pymapd
    • git clone https://github.com/mapd/google-analytics.git
    • https://s3-us-west-1.amazonaws.com/sfpyworkshop/client_secrets.json-metalreaction
  10. Tutorial - MapD Fundamentals

    Run Google Analytics Importer Application
    • cd google-analytics
    • cp ~/Downloads/client_secrets.json-metalreaction client_secrets.json
    • python mapd_ga_data.py
    • Select the profile view (#2) and the date range (2013-01-01 2017-01-01)
    • A successful execution creates the file www.metalreaction.com_metalreaction2.csv.gz under the ./data folder

    MapD Immerse Import Wizard
    Use MapD Immerse’s table import feature to manually load the data from the CSV file into a MapD Core Database table. Open MapD Immerse by launching your MapD Cloud instance. Click Data Manager -> Click Import Data -> Click the + sign or drag-and-drop the gzipped CSV file for upload.

    Create Dashboard
    In MapD Immerse, select the DASHBOARDS tab and click New Dashboard. Then press Add Chart (+ sign) and select from a wide variety of chart types using the different dimensions and measures from your table. Measures are quantitative or numerical values, or aggregated values (SUM, AVG, etc.). Dimensions categorize measures and are used for grouping data.
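The distinction between measures and dimensions can be sketched in a few lines of plain Python: a dimension is the key you group by, and a measure is the numeric value you aggregate within each group. The rows and column names below (`country`, `device`, `pageviews`) are hypothetical stand-ins, not the actual columns of the Google Analytics CSV.

```python
from collections import defaultdict

# Hypothetical rows; the real CSV's column names may differ.
rows = [
    {"country": "US", "device": "mobile", "pageviews": 12},
    {"country": "US", "device": "desktop", "pageviews": 30},
    {"country": "IN", "device": "mobile", "pageviews": 7},
    {"country": "IN", "device": "mobile", "pageviews": 5},
]


def aggregate(rows, dimension, measure):
    """Group rows by a dimension and compute SUM and AVG of a measure."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[dimension]].append(row[measure])
    return {
        key: {"sum": sum(values), "avg": sum(values) / len(values)}
        for key, values in groups.items()
    }


by_country = aggregate(rows, "country", "pageviews")
by_device = aggregate(rows, "device", "pageviews")
print(by_country)
print(by_device)
```

A chart in a dashboard does conceptually the same thing: picking `country` as the dimension and `SUM(pageviews)` as the measure yields one bar or slice per country.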
  11. Tutorial - MapD Fundamentals

    Creating MapD Dashboard from Template
    NOTE: Import/Export of dashboards requires MapD shell access (mapdql).
    You can export your dashboard and save the template as a JSON file.
    • mapdql> \export_dashboard GAWebsiteAnalytics /tmp/GA_sample_dashboard.template
    In the following example, we download a dashboard template for Google Analytics and then substitute the dashboard name and table name in the template file using sed. Then we import the template file to create a dashboard that uses the data from the table we just ingested.
    • curl -O https://s3-us-west-1.amazonaws.com/mapd-cloud/Dashboards/GA_sample_dashboard.template
    • cp GA_sample_dashboard.template /tmp/GA_sample_dashboard.new
    • sed -i -e s/SAMPLE_TABLE_NAME/metal/g /tmp/GA_sample_dashboard.new
    • sed -i -e s/SAMPLE_DASHBOARD_NAME/metal_dash/g /tmp/GA_sample_dashboard.new
    • mapdql> \import_dashboard metal2_dash /tmp/GA_sample_dashboard.new
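If sed is not available (for example on a Windows laptop), the same placeholder substitution can be sketched in Python. The placeholder names and replacement values are the ones used in the sed commands above; the `fill_template` helper and the stand-in template contents are hypothetical, since the real template file's structure is not shown on the slide.

```python
import os
import tempfile


def fill_template(src_path, dst_path, substitutions):
    """Copy src_path to dst_path, replacing every placeholder occurrence."""
    with open(src_path, encoding="utf-8") as src:
        text = src.read()
    for placeholder, value in substitutions.items():
        text = text.replace(placeholder, value)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(text)
    return text


# Self-contained demo using a stand-in template (not the real file contents).
workdir = tempfile.mkdtemp()
template_path = os.path.join(workdir, "GA_sample_dashboard.template")
new_path = os.path.join(workdir, "GA_sample_dashboard.new")
with open(template_path, "w", encoding="utf-8") as f:
    f.write('{"name": "SAMPLE_DASHBOARD_NAME", "table": "SAMPLE_TABLE_NAME"}')

result = fill_template(
    template_path,
    new_path,
    {"SAMPLE_TABLE_NAME": "metal", "SAMPLE_DASHBOARD_NAME": "metal_dash"},
)
print(result)
```

To use it on the downloaded template, point `template_path` at `GA_sample_dashboard.template` and `new_path` at `/tmp/GA_sample_dashboard.new`, then run the `\import_dashboard` command as before.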
  12. Geospatial Objects

    Type           Description
    POINT          A point described by two coordinates.
    LINESTRING     A sequence of 2 or more points and the lines that connect them.
    POLYGON        A set of one or more rings (closed line strings), with the first representing the shape (external ring) and the rest representing holes in that shape (internal rings).
    MULTIPOLYGON   A set of one or more polygons.
  13. Geospatial Table

    Table Creation
    It’s now possible to create tables with geospatial columns:
    CREATE TABLE geo1 (
      p1 POINT,
      l1 LINESTRING,
      poly1 POLYGON,
      mpoly1 MULTIPOLYGON);
    By default, geospatial objects are created as geometries (planar spatial data types). The type definition can be extended to include the spatial reference identifier (SRID) and compression mode information:
    CREATE TABLE geo2 (
      p2 GEOMETRY(POINT),
      l2 GEOMETRY(LINESTRING, 900913),
      poly2 GEOMETRY(POLYGON, 4326) ENCODING NONE,
      mpoly2 GEOMETRY(MULTIPOLYGON, 4326) ENCODING GEOINT(32));
  14. Geospatial Table

    Table Creation
    mapdql> CREATE TABLE geo (pz POINT, p POINT, l LINESTRING) WITH (fragment_size=2);
    mapdql> \d geo
    CREATE TABLE geo (
      pz GEOMETRY(POINT),
      p GEOMETRY(POINT),
      l GEOMETRY(LINESTRING))
    WITH (FRAGMENT_SIZE = 2)
  15. Importing Geospatial Data

    INSERT
    One way is through INSERT, using WKT string values:
    mapdql> \d geo
    CREATE TABLE geo (
      p POINT,
      l LINESTRING,
      poly POLYGON)
    mapdql> insert into geo values('POINT(20 20)', 'LINESTRING(40 0, 40 40)', 'POLYGON(( 0 0, 40 0, 40 40, 0 40, 0 0 ))');

    COPY
    The COPY command is the fastest option for importing geospatial data. At this time it supports binary shapefiles and GeoJSON files.
    COPY table_name FROM full_path WITH (geo='true')
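When the coordinates live in a program rather than in hand-typed SQL, the WKT strings for an INSERT like the one above can be generated. The helpers below are a minimal sketch with hypothetical names; they only build text and do not talk to MapD. Plain string formatting is used for illustration, so it should only be fed trusted numeric input.

```python
def fmt(value):
    """Format a coordinate compactly, e.g. 20 -> '20', 2.5 -> '2.5'."""
    return format(value, ".15g")


def coords(points):
    return ", ".join(f"{fmt(x)} {fmt(y)}" for x, y in points)


def point_wkt(x, y):
    return f"POINT({fmt(x)} {fmt(y)})"


def linestring_wkt(points):
    return f"LINESTRING({coords(points)})"


def polygon_wkt(rings):
    """rings[0] is the external ring, the rest are holes.

    Each ring is closed automatically if its last point differs from its first.
    """
    parts = []
    for ring in rings:
        ring = list(ring)
        if ring[0] != ring[-1]:
            ring.append(ring[0])
        parts.append(f"({coords(ring)})")
    return f"POLYGON({', '.join(parts)})"


def insert_statement(table, wkt_values):
    quoted = ", ".join(f"'{value}'" for value in wkt_values)
    return f"INSERT INTO {table} VALUES({quoted});"


statement = insert_statement(
    "geo",
    [
        point_wkt(20, 20),
        linestring_wkt([(40, 0), (40, 40)]),
        polygon_wkt([[(0, 0), (40, 0), (40, 40), (0, 40)]]),
    ],
)
print(statement)
```

The printed statement matches the slide's INSERT apart from whitespace inside the POLYGON parentheses.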
  16. Importing Geospatial Data

    COPY
    mapdql> COPY footable FROM '/tmp/vs/sffacs_current.zip' WITH (geo='true');
    Result
    Creating table 'footable' and importing geo...
    mapdql> \d footable
    CREATE TABLE footable (
      facility_i TEXT ENCODING DICT(32),
      facility_n TEXT ENCODING DICT(32),
      deptname TEXT ENCODING DICT(32),
      dept INTEGER,
      mapd_geo GEOGRAPHY(POINT, 4326) ENCODING GEOINT(32))

    MapD Immerse
    You can also use the MapD Immerse UI. Click Data Manager -> Click Import Data -> Click the + sign or drag-and-drop to import files in the supported geo formats.
  17. Geospatial Functions

    Geometry Constructor Functions
    Function          Description
    ST_GeomFromText   Return a specified geometry from Well-Known Text representation (WKT)
    ST_GeogFromText   Return a specified geography from Well-Known Text representation (WKT)
  18. Geospatial Functions

    Geometry Editor Functions
    Function       Description
    ST_Transform   Return a new geometry with its coordinates transformed to a different spatial reference. The only supported transform in this release is WGS84 to Web Mercator, e.g. ST_Transform(ST_GeogFromText('POINT(-71.064544 42.28787)',4326),900913)
    ST_SetSRID     Set the SRID on a geometry to a particular integer value.
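For intuition about what a WGS84 (SRID 4326) to Web Mercator (SRID 900913) transform does to a coordinate pair, here is the standard spherical Mercator forward projection in plain Python. This is the textbook formula, offered as a sketch; the slide does not show how MapD implements ST_Transform internally, so do not assume bit-identical results.

```python
import math

# Sphere radius used by spherical Web Mercator (the WGS84 semi-major axis), in meters.
EARTH_RADIUS_M = 6378137.0


def wgs84_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator x/y in meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# The point used in the slide's ST_Transform example.
x, y = wgs84_to_web_mercator(-71.064544, 42.28787)
print(f"x = {x:.2f} m, y = {y:.2f} m")
```

Longitude maps linearly to x, while latitude is stretched increasingly toward the poles, which is why Web Mercator distances are not true ground distances away from the equator.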
  19. Geospatial Functions

    Geometry Accessor Functions
    Function        Description
    ST_XMin         Returns X minima of a geometry.
    ST_XMax         Returns X maxima of a geometry.
    ST_YMin         Returns Y minima of a geometry.
    ST_YMax         Returns Y maxima of a geometry.
    ST_StartPoint   Returns the first point of a LINESTRING as a POINT.
    ST_EndPoint     Returns the last point of a LINESTRING as a POINT.
    ST_PointN       Returns the Nth point of a LINESTRING as a POINT.
    ST_SRID         Returns the spatial reference identifier for the underlying object.
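The accessor functions are easiest to picture on a concrete line string. The sketch below mirrors what the bounding-box and start/end accessors are described as returning, using a plain Python list of coordinate pairs; the sample line and helper names are hypothetical, and this is an emulation of the descriptions rather than MapD code.

```python
def bounds(points):
    """Return the axis-aligned bounding box of a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return {"xmin": min(xs), "xmax": max(xs), "ymin": min(ys), "ymax": max(ys)}


def start_point(points):
    """First vertex of a line string."""
    return points[0]


def end_point(points):
    """Last vertex of a line string."""
    return points[-1]


# Hypothetical line string: LINESTRING(40 0, 40 40, 10 25)
line = [(40, 0), (40, 40), (10, 25)]

box = bounds(line)
print(box)
print(start_point(line), end_point(line))
```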
  20. Geospatial Functions

    Spatial Relationship and Measurement Functions
    Function      Description
    ST_Distance   Returns shortest planar distance between geometries. Returns shortest geodesic distance between geographies (in meters, limited support).
    ST_Contains   Returns true if first geometry contains the second one.
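The difference between planar distance (geometries) and geodesic distance (geographies) can be illustrated for two points. The planar version is ordinary Euclidean distance in coordinate units. For the geodesic version this sketch swaps in the haversine formula on a sphere with a mean Earth radius, which is a common approximation; the slide does not say which formula or ellipsoid MapD uses, so treat the numbers as indicative only.

```python
import math

MEAN_EARTH_RADIUS_M = 6371008.8  # assumed sphere radius for the haversine sketch


def planar_distance(p, q):
    """Euclidean distance between two (x, y) points, in coordinate units."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def haversine_m(lon1, lat1, lon2, lat2, radius_m=MEAN_EARTH_RADIUS_M):
    """Great-circle distance in meters between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


# One degree of longitude apart, at the equator and at 60 degrees north.
print(planar_distance((0, 0), (1, 0)), planar_distance((0, 60), (1, 60)))
print(haversine_m(0, 0, 1, 0), haversine_m(0, 60, 1, 60))
```

Both pairs are exactly 1.0 apart in planar degrees, but on the sphere the pair at 60 degrees north is only about half as far apart in meters as the pair on the equator.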
  26. Try Geospatial functions with mapdql

    mapdql> create table t1 (id INT, resultcol INT, pointcol POINT);
    mapdql> insert into t1 values(0, 100, 'POINT(0 0)');
    mapdql> insert into t1 values(1, 101, 'POINT(1 1)');
    mapdql> insert into t1 values(2, 102, 'POINT(2 2)');
    mapdql> insert into t1 values(3, 103, 'POINT(3 3)');
    mapdql> insert into t1 values(4, 104, 'POINT(4 4)');
    mapdql> insert into t1 values(5, 105, 'POINT(5 5)');
    mapdql> create table t2 (id INT);
    mapdql> insert into t2 values(-1);
    mapdql> insert into t2 values(1);
    mapdql> insert into t2 values(3);
    mapdql> insert into t2 values(5);
  27. Try Geospatial functions with mapdql

    mapdql> SELECT t1.resultcol FROM t1 JOIN t2 ON t1.id = t2.id;
    mapdql> SELECT t1.resultcol FROM t1 JOIN t2 ON t1.id = t2.id WHERE ST_DISTANCE(t1.pointcol, 'POINT(0 0)') > 2.0;
    mapdql> SELECT t1.resultcol FROM t1 JOIN t2 ON t1.id = t2.id WHERE ST_CONTAINS('POLYGON((4 0, 4 4, 0 4, 0 0, 4 0))', t1.pointcol);
    mapdql> SELECT t1.resultcol FROM t1 JOIN t2 ON t1.id = t2.id WHERE ST_CONTAINS('POLYGON((4 0, 4 4, 0 4, 0 0, 4 0), (2 0, 2 2, 0 2, 0 0, 2 0))', t1.pointcol);
    mapdql> SELECT t1.resultcol FROM t1 JOIN t2 ON t1.id = t2.id WHERE ST_CONTAINS('POLYGON((4 0, 4 4, 0 4, 0 0, 4 0), (2 0, 2 2, 0 2, 0 0, 2 0))', t1.pointcol) OR ST_DISTANCE(t1.pointcol, 'POINT(6 6)') < 1.5;
    mapdql> create table t3 (pointcol POINT);
    mapdql> insert into t3 values('POINT(2.5 2.5)');
    mapdql> insert into t3 values('POINT(6 6)');
    mapdql> SELECT t1.resultcol FROM t1 JOIN t3 ON ST_DISTANCE(t1.pointcol, t3.pointcol) < 1.45;
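To reason about what these queries should return before running them, the predicates can be emulated in plain Python. This is a sketch of the planar semantics described earlier (Euclidean distance, and "contains" meaning inside the external ring but outside every hole), not MapD's implementation; in particular, points lying exactly on a ring boundary may be treated differently by the database. None of the sample points fall on a boundary.

```python
import math

# t1: id -> (resultcol, pointcol); t2: ids; t3: points. Same data as the slides.
t1 = {i: (100 + i, (float(i), float(i))) for i in range(6)}
t2 = [-1, 1, 3, 5]
t3 = [(2.5, 2.5), (6.0, 6.0)]

SQUARE = [(4, 0), (4, 4), (0, 4), (0, 0), (4, 0)]
HOLE = [(2, 0), (2, 2), (0, 2), (0, 0), (2, 0)]


def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def in_ring(pt, ring):
    """Ray-casting point-in-ring test (boundary points not handled specially)."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def contains(rings, pt):
    """Inside the external ring and outside every hole."""
    return in_ring(pt, rings[0]) and not any(in_ring(pt, hole) for hole in rings[1:])


# Emulates: t1 JOIN t2 ON t1.id = t2.id
joined = [t1[i] for i in t2 if i in t1]

q_join = sorted(r for r, _ in joined)                                       # [101, 103, 105]
q_far = sorted(r for r, p in joined if dist(p, (0, 0)) > 2.0)               # [103, 105]
q_square = sorted(r for r, p in joined if contains([SQUARE], p))            # [101, 103]
q_hole = sorted(r for r, p in joined if contains([SQUARE, HOLE], p))        # [103]
q_or = sorted(r for r, p in joined
              if contains([SQUARE, HOLE], p) or dist(p, (6, 6)) < 1.5)      # [103, 105]
q_t3 = sorted(r for r, p in t1.values() for q in t3 if dist(p, q) < 1.45)   # [102, 103, 105]

for name, rows in [("join", q_join), ("far", q_far), ("square", q_square),
                   ("hole", q_hole), ("or", q_or), ("t3", q_t3)]:
    print(name, rows)
```

The hole example is the instructive one: POINT(1 1) is inside the outer square but also inside the inner ring, so the polygon with a hole no longer contains it. Row order in the real query output is not guaranteed, which is why the sketch sorts its results.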
  28. Tutorial - Geospatial Rendering

    Using Vega spec for MapD backend rendering with MapBox GL JS
  29. Geospatial Rendering with Vega/MapBoxGL JS

    Software Prerequisite
    • git clone https://github.com/omveda/mapd-vega-mapboxgl-geo
    • Follow the instructions in the README to install software on your laptop.
      ❏ cd mapd-vega-mapboxgl-geo
      ❏ yarn install
      ❏ npm start

    MapD Vega is based on the open-source Vega specification. It has been adapted to drive the rendering engine directly on the result set of a SQL query, without the data ever leaving the GPU. The MapD Connector API makes it easy to send the Vega JSON to the backend, which renders the visualization and returns a base64-encoded PNG image to the client.

    A Vega specification includes:
    • a data property that specifies and filters data source(s).
    • a marks property that defines the basic visualization graphic of a data item.
    • a scales property that defines geometry or applies additional attributes to the data item visualization.
    • viewing area dimensions.
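The four parts of a Vega specification listed above can be seen together in a small example. The sketch below assembles a point-map spec as a Python dictionary and serializes it to JSON. The overall shape (width, height, data with a sql entry, scales, marks) follows the list on this slide, but the specific property names inside each part, the table `my_points`, and its `lon`/`lat` columns are assumptions for illustration; check the MapD Vega reference for the exact schema your version accepts.

```python
import json


def build_vega_spec(sql, width=800, height=600):
    """Assemble a minimal point-rendering spec (property names are assumed)."""
    return {
        # Viewing area dimensions, in pixels.
        "width": width,
        "height": height,
        # Data: the SQL query whose result set is rendered.
        "data": [{"name": "points", "sql": sql}],
        # Scales: map data values to pixel positions.
        "scales": [
            {"name": "x", "type": "linear", "domain": [-180, 180], "range": "width"},
            {"name": "y", "type": "linear", "domain": [-90, 90], "range": "height"},
        ],
        # Marks: how each row is drawn.
        "marks": [
            {
                "type": "points",
                "from": {"data": "points"},
                "properties": {
                    "x": {"scale": "x", "field": "x"},
                    "y": {"scale": "y", "field": "y"},
                    "fillColor": "blue",
                    "size": {"value": 3},
                },
            }
        ],
    }


# Hypothetical table and column names.
spec = build_vega_spec("SELECT lon AS x, lat AS y FROM my_points")
vega_json = json.dumps(spec, indent=2)
print(vega_json)
```

The resulting JSON string is what a client would hand to the connector's render call; the sample repository's README covers the actual JavaScript wiring.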
  30. Next Steps

    • community.mapd.com: Ask questions and share your experiences
    • mapd.com/cloud: Try 14-day free trial, no credit card needed
    • mapd.com/demos: Play with our demos
    • mapd.com/platform/download-community/: Get our free Community Edition and start playing