Geospatial CSV Imports Hidden Complexity
Kartones
October 27, 2015
Programming
@ Mindcamp 7.0 2015, Madrid
Transcript
@Kartones GEOSPATIAL CSV IMPORTS HIDDEN COMPLEXITY
@Kartones CartoDB
@Kartones Agenda: 1) CSV Format Issues; 2) Import Issues
@Kartones CSV FORMAT ISSUES
@Kartones Intro: .csv / MIME: text/csv; Unknown birthdate (80s?); RFC 4180 (2005)
@Kartones Intro: Plain text; Simple format; Simple rules
@Kartones Usage
@Kartones CSV
0101000020E610000000000000008049C000000000000038C0,1083
"alien",2014-11-04 15:24:40.43413+00
category 1, "jump jump up!", {""value"":""es""}
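The sample values above mix hex-encoded geometry, timestamps, free text and JSON with doubled quotes. As a minimal sketch of the quoting rule involved (RFC 4180 doubles any quote inside a quoted field), the following uses Python's standard `csv` module. The `roundtrip` helper and the sample row are illustrative assumptions, not taken from the talk.

```python
import csv
import io

def roundtrip(row):
    """Write one row as CSV text, then parse it back (hypothetical helper)."""
    buf = io.StringIO()
    # The default dialect quotes only fields that need it and doubles embedded quotes.
    csv.writer(buf).writerow(row)
    text = buf.getvalue()
    parsed = next(csv.reader(io.StringIO(text)))
    return text, parsed

row = ["category 1", "jump jump up!", '{"value":"es"}']
text, parsed = roundtrip(row)
print(text, end="")
print(parsed == row)
```

Only the JSON field is quoted on output, because it is the only one containing a quote character; the parsed row matches the original.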
@Kartones WKT: Well-Known Text
POINT (30 10)
LINESTRING (30 10, 10 30, 40 40)
POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))
MULTIPOINT ((10 40), (40 30), (20 20), (30 10))
MULTIPOLYGON (((30 20, 45 40, 10 40, 30 20)), ((15 5, 40 10, 10 20, 5 10, 15 5)))
https://en.wikipedia.org/wiki/Well-known_text
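A rough sketch of how WKT strings like the ones on this slide could be built from coordinate pairs. The function names are hypothetical; a real import pipeline would more likely rely on a geometry library.

```python
def _coords(points):
    # "x y" pairs separated by commas, as in the WKT examples
    return ", ".join(f"{x:g} {y:g}" for x, y in points)

def point_wkt(x, y):
    return f"POINT ({x:g} {y:g})"

def linestring_wkt(points):
    return f"LINESTRING ({_coords(points)})"

def polygon_wkt(ring):
    # A polygon ring must be closed: repeat the first vertex if needed
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])
    return f"POLYGON (({_coords(ring)}))"

print(point_wkt(30, 10))
print(linestring_wkt([(30, 10), (10, 30), (40, 40)]))
print(polygon_wkt([(30, 10), (40, 40), (20, 40), (10, 20)]))
```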
@Kartones WKB: Well-Known Binary
POINT(2.0 4.0) = 000000000140000000000000004010000000000000
https://en.wikipedia.org/wiki/Well-known_text#Well-known_binary
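The hex string on this slide can be reproduced by hand: one byte-order byte, a 4-byte geometry type (1 = Point), then two 8-byte IEEE 754 doubles. A minimal sketch with Python's `struct` module follows; `point_to_wkb_hex` is a hypothetical helper and handles 2D points only.

```python
import struct

def point_to_wkb_hex(x, y, big_endian=True):
    """Encode a 2D point as WKB and return it as an uppercase hex string."""
    order = ">" if big_endian else "<"
    byte_order_flag = 0 if big_endian else 1  # 0 = big-endian (XDR), 1 = little-endian (NDR)
    wkb_point_type = 1
    # uint32 geometry type followed by two float64 coordinates, no padding
    payload = struct.pack(order + "Idd", wkb_point_type, x, y)
    return (bytes([byte_order_flag]) + payload).hex().upper()

print(point_to_wkb_hex(2.0, 4.0))
```

The default big-endian encoding yields the slide's value. Passing `big_endian=False` gives the little-endian form, which starts with `01`.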
@Kartones GeoJSON
{ "type": "Feature", "geometry": { "type": "Point", "coordinates": [125.6, 10.1] }, "properties": { "name": "Dinagat Islands" } }
http://geojson.org/
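A sketch of turning one parsed CSV row into a GeoJSON Feature like the one above. The column names `longitude` and `latitude` and the helper `row_to_feature` are assumptions for illustration. Note that GeoJSON orders coordinates as longitude, then latitude.

```python
import json

def row_to_feature(row, lon_key="longitude", lat_key="latitude"):
    """Build a GeoJSON Point Feature from a dict of CSV strings."""
    properties = {k: v for k, v in row.items() if k not in (lon_key, lat_key)}
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            # GeoJSON order: [longitude, latitude]
            "coordinates": [float(row[lon_key]), float(row[lat_key])],
        },
        "properties": properties,
    }

feature = row_to_feature(
    {"name": "Dinagat Islands", "longitude": "125.6", "latitude": "10.1"}
)
print(json.dumps(feature))
```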
@Kartones IMPORT ISSUES
@Kartones Typical: Huge files (>1GB); Lots of rows (+2M); Lots of columns (~1600); XLS/XLSX -> CSV
@Kartones Typical: Stream HTTP downloaded file; Stream file between servers; Stream data import to DB
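The streaming idea can be sketched as reading the CSV in fixed-size batches rather than loading the whole file into memory. This is only an illustration: `iter_batches` is a hypothetical helper, and the in-memory `StringIO` stands in for any text stream, such as an open file or a wrapped HTTP response.

```python
import csv
import io
from itertools import islice

def iter_batches(text_stream, batch_size=1000):
    """Yield (header, rows) tuples, holding at most batch_size rows in memory."""
    reader = csv.reader(text_stream)
    header = next(reader, None)
    if header is None:  # empty input
        return
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield header, batch

stream = io.StringIO("a,b\n1,2\n3,4\n5,6\n")
batches = [rows for _, rows in iter_batches(stream, batch_size=2)]
print(batches)
```

Each batch could then be handed to the database step, so memory use depends on the batch size rather than on the file size.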
@Kartones Typical
@Kartones CartoDB-specific: Content guessing (e.g. lat/lon); Type guessing; Fixing geometry errors; Sync tables -> No downtime allowed
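A simplified sketch of what type guessing and lat/lon content guessing could look like. This is not CartoDB's actual logic: the candidate column names, the type labels and both functions are assumptions made up for the example.

```python
from datetime import datetime

def guess_type(values):
    """Return the narrowest type name that fits every non-empty sample value."""
    samples = [v.strip() for v in values if v.strip()]
    if not samples:
        return "text"
    candidates = (
        ("integer", int),
        ("float", float),
        ("timestamp", datetime.fromisoformat),
    )
    for name, parse in candidates:
        try:
            for v in samples:
                parse(v)
            return name
        except ValueError:
            continue  # at least one sample does not fit, try the next type
    return "text"

# Hypothetical column-name heuristics for coordinate detection
LAT_NAMES = {"lat", "latitude"}
LON_NAMES = {"lon", "lng", "long", "longitude"}

def guess_lat_lon(header):
    """Return (lat_column, lon_column) if both are found by name, else None."""
    lat = next((c for c in header if c.lower() in LAT_NAMES), None)
    lon = next((c for c in header if c.lower() in LON_NAMES), None)
    return (lat, lon) if lat and lon else None

print(guess_type(["1083", "7"]))
print(guess_lat_lon(["name", "Lat", "LON"]))
```

Guessing from a sample of rows is inherently fallible, so a real importer would need a fallback (for example, reverting a column to text) when later rows contradict the guess.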
@Kartones DB-Specific: Leave DB indexes as the last step; Prefer one big INSERT to multiple UPDATEs; GDAL’s ogr2ogr > Ruby/Python scripts; http://www.gdal.org/ogr2ogr.html
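The first two points can be sketched as: load all rows in a single transaction, then create the index once at the end. The example uses Python's built-in `sqlite3` as a stand-in for the real database, which is an assumption. The table, column and index names are hypothetical, and `executemany` only approximates a single large INSERT.

```python
import sqlite3

def bulk_load(rows):
    """Insert all rows in one transaction, then build the index afterwards."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE places (name TEXT, lon REAL, lat REAL)")
    with conn:  # commits once for the whole batch
        conn.executemany("INSERT INTO places VALUES (?, ?, ?)", rows)
    # Index created last, so it is built once instead of updated per row
    conn.execute("CREATE INDEX places_name_idx ON places (name)")
    return conn

conn = bulk_load([("Dinagat Islands", 125.6, 10.1), ("Madrid", -3.7, 40.4)])
count = conn.execute("SELECT COUNT(*) FROM places").fetchone()[0]
print(count)
```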
@Kartones Questions? Thanks!