OmniSci
August 25, 2018

MapD Workshop: Visualizing Billions Of Data Points With GPUs

There has been an explosion of geospatial data sources and collection in recent years, and with it a need for technologies that let data analysts query and visualize very large data sets interactively, exploring and acting on the data in real time. In this hands-on tutorial, each attendee will be given a geospatial data set to ingest into MapD. By the end of the workshop, you will know how to query and visualize extremely large geospatial data sets.

In this workshop, participants will be given access to a MapD instance and hands-on instruction in the following:

-How to install and set up the MapD platform in the cloud or on premises.
-How to import a geospatial dataset, either one of our samples or one of your own.
-How to create dashboards with geospatial charts in MapD Immerse.
-How to create multi-layer maps, combining data from multiple sources.
-How to update dashboards and share them.

Transcript

  1. © MapD 2018 Aaron Williams, VP of Global Community
     @_arw_ /in/aaronwilliams/ /williamsaaron
     slides: https://speakerdeck.com/mapd
  2. © MapD 2018 Do This Now
     If you want to participate in the tutorials, sign up for a free trial account on MapD Cloud: http://mapd.cloud
  3. Core Density Makes a Huge Difference (*fictitious example)

     Processing   Cores     Latency          Throughput
     CPU          20        1 ns per task    (1 task/ns) x (20 cores) = 20 tasks/ns
     GPU          40,000    10 ns per task   (0.1 tasks/ns) x (40,000 cores) = 4,000 tasks/ns

     Latency: time to do a task. Throughput: number of tasks per unit time.
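The arithmetic on this slide can be reproduced with a short Python sketch. It uses the slide's own fictitious core counts and latencies; the `Processor` class and its method name are illustrative only and are not part of any MapD API.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    """Illustrative model: identical cores that each work on one task at a time."""
    name: str
    cores: int
    latency_ns: float  # time for one core to finish one task

    def throughput_per_ns(self) -> float:
        # Each core completes 1/latency tasks per ns, and all cores run in parallel.
        return self.cores / self.latency_ns


# Fictitious figures taken from the slide.
cpu = Processor("CPU", cores=20, latency_ns=1.0)
gpu = Processor("GPU", cores=40_000, latency_ns=10.0)

for p in (cpu, gpu):
    print(f"{p.name}: {p.throughput_per_ns():,.0f} tasks/ns")

ratio = gpu.throughput_per_ns() / cpu.throughput_per_ns()
print(f"GPU/CPU throughput ratio: {ratio:.0f}x")
```

Although each GPU task takes ten times longer in this example, the much larger core count gives the GPU the higher overall throughput.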
  4. © MapD 2018 Advanced memory management
     Three-tier caching to GPU RAM for speed and to SSDs for persistent storage

     Tier                        Data    Capacity         Bandwidth           Speedup
     GPU RAM (L1)                Hot     24GB to 256GB    1000-6000 GB/sec    1500x to 5000x over cold data
     CPU RAM (L2)                Warm    32GB to 3TB      70-120 GB/sec       35x to 120x over cold data
     SSD or NVRAM STORAGE (L3)   Cold    250GB to 20TB    1-2 GB/sec

     Diagram labels: COMPUTE LAYER, STORAGE LAYER, Data Lake/Data Warehouse/System Of Record
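To get a feel for what these bandwidths mean, here is a rough Python sketch that estimates how long one full pass over a data set would take in each tier. It uses the low end of each bandwidth range from the slide; the 100 GB data set size is a hypothetical value, and the model ignores query overhead, compression, and transfer costs between tiers.

```python
# Low end of each tier's bandwidth range from the slide, in GB/sec.
TIER_BANDWIDTH_GB_S = {
    "SSD or NVRAM (L3)": 1,
    "CPU RAM (L2)": 70,
    "GPU RAM (L1)": 1000,
}


def scan_seconds(dataset_gb: float, bandwidth_gb_s: float) -> float:
    """Idealized time to read the whole data set once at a given bandwidth."""
    return dataset_gb / bandwidth_gb_s


DATASET_GB = 100  # hypothetical data set size, chosen only for illustration

for tier, bandwidth in TIER_BANDWIDTH_GB_S.items():
    print(f"{tier}: {scan_seconds(DATASET_GB, bandwidth):.2f} s")
```

Under these assumptions the same scan drops from minutes on SSD to a fraction of a second in GPU RAM, which is the motivation for keeping hot data in the top tier.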
  5. © MapD 2018 Ibis Interface
     Scaling the familiar pandas DataFrame API into the billions of records at interactive speed
     https://www.mapd.com/blog/scaling-pandas-to-the-billions-with-ibis-and-mapd/
  6. © MapD 2018 LIDAR in 3D with deck.gl
     Check out our custom app to visualize large, complex data sets like LIDAR
     https://www.mapd.com/blog/3d-lidar-with-mapd-and-ubers-deck-gl/
  7. © MapD 2018 Last Chance ...
     If you want to participate in the tutorials, sign up for a free trial account on MapD Cloud: http://mapd.cloud
  8. © MapD 2018 Step 1: MapD Immerse Basics
     1. View your data in the Data Manager
     2. Import data
        a. Local CSV
        b. S3 Bucket
     3. View your dashboards
     4. Create a new dashboard
        a. SAVE!
  9. © MapD 2018 Geospatial Objects

     Type           Description
     POINT          A point described by two coordinates.
     LINESTRING     A sequence of 2 or more points and the lines that connect them.
     POLYGON        A set of one or more rings (closed line strings), with the first representing the shape (external ring) and the rest representing holes in that shape (internal rings).
     MULTIPOLYGON   A set of one or more polygons.
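The four types above map directly onto Well-Known Text (WKT), a common text encoding for geometries. The following Python sketch builds WKT strings from plain coordinate lists to make the nesting of each type concrete. The helper function names are made up for this illustration, and the slide does not say how a given MapD version expects geometries to be supplied, so treat this as a format illustration rather than a loading recipe.

```python
from typing import Sequence, Tuple

Coord = Tuple[float, float]
Ring = Sequence[Coord]


def _coords(points: Sequence[Coord]) -> str:
    # WKT separates x and y with a space and points with a comma.
    return ", ".join(f"{x} {y}" for x, y in points)


def point(p: Coord) -> str:
    return f"POINT ({p[0]} {p[1]})"


def linestring(points: Sequence[Coord]) -> str:
    if len(points) < 2:
        raise ValueError("a LINESTRING needs 2 or more points")
    return f"LINESTRING ({_coords(points)})"


def _rings(rings: Sequence[Ring]) -> str:
    for ring in rings:
        if ring[0] != ring[-1]:
            raise ValueError("each ring must be closed (first point == last point)")
    return ", ".join(f"({_coords(ring)})" for ring in rings)


def polygon(rings: Sequence[Ring]) -> str:
    # First ring is the external shape; any further rings are holes.
    return f"POLYGON ({_rings(rings)})"


def multipolygon(polygons: Sequence[Sequence[Ring]]) -> str:
    return "MULTIPOLYGON (" + ", ".join(f"({_rings(p)})" for p in polygons) + ")"


outer = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
hole = [(1, 1), (2, 1), (2, 2), (1, 2), (1, 1)]

print(point((1, 2)))
print(linestring([(0, 0), (1, 1)]))
print(polygon([outer, hole]))
print(multipolygon([[outer, hole], [[(5, 5), (6, 5), (6, 6), (5, 5)]]]))
```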
  10. © MapD 2018 Step 2: Loading MapD Shapefiles into Immerse
      1. Polygons: SF City and County Subdivision Parcels, MULTIPOLYGONS in GeoJSON
         https://s3.amazonaws.com/mapd-data/geodata/citylots.json
      2. Points: SF City-owned Critical Facilities, POINTS in ESRI Shapefile
         https://s3.amazonaws.com/mapd-data/geodata/sffacs_current.zip
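Before importing a GeoJSON file, it can help to check which geometry types it actually contains. The sketch below counts geometry types in a GeoJSON FeatureCollection using only the Python standard library. The inline `SAMPLE` document is invented for illustration and is not taken from the citylots file; to inspect a real file, read its text from disk and pass it to the same function.

```python
import json
from collections import Counter

# Tiny hand-written FeatureCollection standing in for a real GeoJSON file.
SAMPLE = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "properties": {"name": "a"},
     "geometry": {"type": "Point", "coordinates": [-122.4, 37.7]}},
    {"type": "Feature", "properties": {"name": "b"},
     "geometry": {"type": "MultiPolygon",
                  "coordinates": [[[[0, 0], [1, 0], [1, 1], [0, 0]]]]}},
    {"type": "Feature", "properties": {"name": "c"},
     "geometry": {"type": "MultiPolygon",
                  "coordinates": [[[[2, 2], [3, 2], [3, 3], [2, 2]]]]}},
    {"type": "Feature", "properties": {"name": "d"}, "geometry": null}
  ]
}
"""


def geometry_types(geojson_text: str) -> Counter:
    """Count features per geometry type, skipping features with no geometry."""
    doc = json.loads(geojson_text)
    return Counter(
        feature["geometry"]["type"]
        for feature in doc.get("features", [])
        if feature.get("geometry")
    )


counts = geometry_types(SAMPLE)
for geom_type, n in counts.most_common():
    print(f"{geom_type}: {n}")
```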
  11. © MapD 2018 Aaron Williams, VP of Global Community
      @_arw_ /in/aaronwilliams/ /williamsaaron
      slides: https://speakerdeck.com/mapd
      Thank you! Questions?