Bulk + Open Data APIs
Chris Herwig
April 04, 2013
Technology
Transcript
Bulk + Open
Chris Herwig, @hrwgc
Chris Herwig, Satellite Team Lead, MapBox
MapBox Satellite Phase 1, launched 12/2012
• Global imagery base layer for MapBox users
• Global satellite imagery, zoom 0-12
• Continental U.S. aerial imagery, zoom 13-17
• Licensed to allow for OSM tracing
MapBox Satellite Phase 1 was sourced entirely from public domain, open data.
Kuala Lumpur, Malaysia
Los Angeles, CA
Brawley, CA
Cloudless Atlas
• Cloud-free global mosaic, zoom 0-8
• NASA MODIS Aqua and Terra satellites
• 380,000 source satellite images
Open data is good.
“data is open if anyone is free to use, reuse, and redistribute it ...”
“subject to the requirement to attribute and/or share-alike”
- Open Knowledge Definition
ACCESS
- open license
- open format
- available for download
ACCESS
3+1 ASSUMPTIONS
There are different types of open data users.
Different users have different needs and abilities.
Data accessibility matters.
Open data is not truly open if it is inaccessible.
3 USERS
CASUAL
casual
• least technical
• dataset discovery
• basic needs: ability to query and download
casual
• geoportal
• simple HTML table
• solid metadata
• intuitive interface
casual: USGS EarthExplorer, http://earthexplorer.usgs.gov
casual: Massachusetts GIS, http://gis.amherstma.gov/mgis/
casual: The National Map, http://nationalmap.gov
casual: Utah AGRC Raster Data Discovery, http://gis.utah.gov
casual: New Hampshire Statewide GIS Clearinghouse, http://www.granit.unh.edu/data/downloadfreedata/category/databycategory.html
PROGRAMMATIC
programmatic
• tech skills/API familiarity
• spatial query
• download sub-dataset based on parent process
programmatic
• API
• developer documentation
• solid metadata
• interface optional
programmatic: USGS Application Services, http://cumulus.cr.usgs.gov/app_services.php
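The spatial query a programmatic user relies on can be sketched as a bounding-box intersection test against a catalog. This is a minimal illustration only: the `BBox` type, the `CATALOG` entries, and the scene names are hypothetical and are not taken from any USGS service. It also assumes boxes do not cross the antimeridian.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Bounding box in degrees: west/south/east/north."""
    west: float
    south: float
    east: float
    north: float


def intersects(a: BBox, b: BBox) -> bool:
    """True if the two boxes overlap or touch (no antimeridian handling)."""
    return (a.west <= b.east and b.west <= a.east
            and a.south <= b.north and b.south <= a.north)


# Hypothetical catalog of scene footprints, invented for illustration.
CATALOG = {
    "scene_a": BBox(-119.0, 33.0, -117.0, 35.0),
    "scene_b": BBox(100.0, 2.0, 102.0, 4.0),
    "scene_c": BBox(-116.0, 32.0, -115.0, 33.5),
}


def spatial_query(catalog, area):
    """Return the names of catalog entries whose footprint intersects `area`."""
    return sorted(name for name, box in catalog.items() if intersects(box, area))


# Rough area of interest around Los Angeles (approximate coordinates).
area_of_interest = BBox(-118.7, 33.7, -118.1, 34.3)
hits = spatial_query(CATALOG, area_of_interest)
print(hits)
```

Only the matching subset is returned, which is why this access pattern suits users who want part of a dataset rather than all of it.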
BULK
bulk
• Need entire datasets, not spatial intersections
• Data APIs/manual retrieval workflows do not scale
• Sometimes retrieve data via physical drives
bulk
• interface optional
• FTP-like access
• reasonable bandwidth for download retrieval
bulk: New Hampshire Statewide GIS Clearinghouse, http://www.granit.unh.edu/
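FTP-like access is easy to script because a plain directory listing is enough to enumerate every file. The sketch below assumes an Apache-style HTML index page. The `SAMPLE_INDEX` markup and the `example.org` URL are made up for illustration and do not describe any real clearinghouse.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IndexLinks(HTMLParser):
    """Collect the href of every <a> tag in a directory index page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def list_files(index_html, base_url):
    """Return absolute URLs for the files (not folders) in a listing."""
    parser = IndexLinks()
    parser.feed(index_html)
    files = []
    for href in parser.hrefs:
        # Skip column-sort links, absolute/parent links, and subdirectories.
        if href.startswith(("?", "/", "..")) or href.endswith("/"):
            continue
        files.append(urljoin(base_url, href))
    return files


# Invented sample of what an auto-generated index page might contain.
SAMPLE_INDEX = """<html><body><pre>
<a href="?C=N;O=D">Name</a>
<a href="/data/">Parent Directory</a>
<a href="tile_001.tif">tile_001.tif</a>
<a href="tile_002.tif">tile_002.tif</a>
<a href="metadata/">metadata/</a>
</pre></body></html>"""

files = list_files(SAMPLE_INDEX, "http://example.org/data/imagery/")
print(files)
```

The resulting list can be handed to any download tool, so no custom interface or API client is needed to fetch a whole dataset.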
3 API TYPES
CONTENT
[Diagram: Database, REST, Content]
content
• Makes application content available for developers to integrate into existing/new applications
Content
DATA
[Diagram: Database, REST, Matching Rows]
data
• Allows users to query large datasets without holding the full dataset locally
• Applications can be built on top of live/real-time datasets
data: http://api.occupy-data.org/v1/?results&value=crossst&value=age&value=race&value=crimsusp&value=sex&value=build&value=frisked&results_per_page=100
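A Data API query of this kind is just a URL with repeated parameters, so it can be assembled programmatically. The sketch below rebuilds the query string shown on the slide. The parameter names (`value`, `results_per_page`) and the bare `results` flag are copied from that URL; their exact meaning is an assumption, and the endpoint may no longer resolve. The helper name `build_data_query` is hypothetical.

```python
from urllib.parse import urlencode

# Endpoint as shown on the slide; it may no longer be online.
BASE_URL = "http://api.occupy-data.org/v1/"


def build_data_query(fields, per_page=100):
    """Build a query URL that asks for the given columns.

    One `value=` parameter is emitted per requested field, followed by
    the page size. The leading bare `results` flag mirrors the slide.
    """
    pairs = [("value", field) for field in fields]
    pairs.append(("results_per_page", per_page))
    return BASE_URL + "?results&" + urlencode(pairs)


url = build_data_query(
    ["crossst", "age", "race", "crimsusp", "sex", "build", "frisked"]
)
print(url)
```

The client receives only the matching rows for the requested columns, never the full dataset.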
BULK
[Diagram: Database, REST, References]
bulk
• Key difference is user obtains reference to object requested, rather than object itself
• Download object(s) later
• Can be relatively lightweight
SO?
Data API = Best Open Data Method?
NO.
APIs, like geoportals, are not always the best option for disseminating open data.
Different Users
Different Needs
Different Abilities
Different Access Endpoints
STUFF BREAKS
Permalinks != Permanent
Wayback Machine, http://archive.org/web/web.php
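When a permalink stops resolving, an archived copy is one fallback. The sketch below only builds a lookup address, assuming the Wayback Machine's usual `/web/<timestamp>/<url>` pattern. It does not check whether a snapshot actually exists, and the helper name `wayback_url` is hypothetical.

```python
def wayback_url(original_url, timestamp="2013"):
    """Build a Wayback Machine address for `original_url`.

    `timestamp` is a full or partial YYYYMMDDhhmmss string; the archive
    is expected to redirect to the closest snapshot it holds, if any.
    """
    return f"https://web.archive.org/web/{timestamp}/{original_url}"


archived = wayback_url("http://nationalmap.gov", "20130404")
print(archived)
```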
So?
Open data users change as tech changes.
Access should be a policy and tech consideration.
NEXT STEPS
Strive to be SAD
Scalable, Accessible, Durable
Scalable
- Open systems for access to open data
- Can grow in response to changes in technology/user requirements
Accessible
- Data access and retrieval is as quick and painless as possible
- Options for users with different abilities, different desired results
Durable
- APIs, geoportals don’t always work
- Low-maintenance, durable options
- FTP-like directory access
- Good documentation
San Francisco, CA
[email protected]
@hrwgc