Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Geospatial CSV Imports Hidden Complexity
Search
Kartones
October 27, 2015
Programming
0
55
Geospatial CSV Imports Hidden Complexity
@ Mindcamp 7.0 2015, Madrid
Kartones
October 27, 2015
Tweet
Share
More Decks by Kartones
See All by Kartones
Building Autonomous Agents with gym-retro
kartones
0
45
Python static typing with MyPy
kartones
0
71
High-impact refactors keeping the lights on
kartones
0
67
Remote Work
kartones
0
94
Intro to GameBoy Development
kartones
0
100
Myths & The Real World of OpenSource Development
kartones
0
49
CartoDB Tech Intro
kartones
0
49
Copy Protection & Cracking History
kartones
0
130
Cómo ganar dinero con tus juegos online
kartones
1
120
Other Decks in Programming
See All in Programming
Oxlintはいいぞ
yug1224
5
1.3k
Basic Architectures
denyspoltorak
0
680
2026年 エンジニアリング自己学習法
yumechi
0
140
組織で育むオブザーバビリティ
ryota_hnk
0
180
開発者から情シスまで - 多様なユーザー層に届けるAPI提供戦略 / Postman API Night Okinawa 2026 Winter
tasshi
0
200
AI によるインシデント初動調査の自動化を行う AI インシデントコマンダーを作った話
azukiazusa1
1
730
15年続くIoTサービスのSREエンジニアが挑む分散トレーシング導入
melonps
2
200
SourceGeneratorのススメ
htkym
0
200
Oxlint JS plugins
kazupon
1
960
AWS re:Invent 2025参加 直前 Seattle-Tacoma Airport(SEA)におけるハードウェア紛失インシデントLT
tetutetu214
2
110
そのAIレビュー、レビューしてますか? / Are you reviewing those AI reviews?
rkaga
6
4.6k
ぼくの開発環境2026
yuzneri
0
230
Featured
See All Featured
Automating Front-end Workflow
addyosmani
1371
200k
Un-Boring Meetings
codingconduct
0
200
Facilitating Awesome Meetings
lara
57
6.8k
HDC tutorial
michielstock
1
380
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
55
3.2k
The agentic SEO stack - context over prompts
schlessera
0
640
Between Models and Reality
mayunak
1
190
We Have a Design System, Now What?
morganepeng
54
8k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
141
34k
AI in Enterprises - Java and Open Source to the Rescue
ivargrimstad
0
1.1k
How People are Using Generative and Agentic AI to Supercharge Their Products, Projects, Services and Value Streams Today
helenjbeal
1
120
Bridging the Design Gap: How Collaborative Modelling removes blockers to flow between stakeholders and teams @FastFlow conf
baasie
0
450
Transcript
@Kartones GEOSPATIAL CSV IMPORTS HIDDEN COMPLEXITY
@Kartones CartoDB
@Kartones Agenda 1) CSV Format Issues 2) Import Issues
@Kartones CSV FORMAT ISSUES
@Kartones Intro .csv / MIME:text/csv Unknown birthdate (80s?) RFC 4180
(2005)
@Kartones Intro Plain text Simple format Simple rules
@Kartones Usage
@Kartones CSV 0101000020E610000000000000008049C000000000000038C0,1083 "alien",2014-11-04 15:24:40.43413+00 category 1, "jump jump up!",
{""value"":""es""}
@Kartones WKT: Well-Known Text POINT (30 10) LINESTRING (30 10,
10 30, 40 40) POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10)) MULTIPOINT ((10 40), (40 30), (20 20), (30 10)) MULTIPOLYGON (((30 20, 45 40, 10 40, 30 20)), ((15 5, 40 10, 10 20, 5 10, 15 5))) https://en.wikipedia.org/wiki/Well-known_text
@Kartones WKB: Well-Known Binary POINT(2.0 4.0) = 000000000140000000000000004010000000000000 https://en.wikipedia.org/wiki/Well-known_text#Well-known_binary
@Kartones GeoJSON { "type": "Feature", "geometry": { "type": "Point", "coordinates":
[125.6, 10.1] }, "properties": { "name": "Dinagat Islands" } } http://geojson.org/
@Kartones IMPORT ISSUES
@Kartones Typical Huge files (>1GB) Lots of rows (+2M) Lots
of columns (~1600) XLS/XLSX -> CSV
@Kartones Typical Stream HTTP downloaded file Stream file between servers
Stream data import to DB
@Kartones Typical
@Kartones CartoDB-specific Content guessing (e.g. lat/lon) Type guessing Geometry errors
fixing Sync tables -> No downtime allowed
@Kartones DB-Specific Leave DB indexes as last step Prefer big
INSERT to multiple UPDATE GDAL’s ogr2ogr > Ruby/Python scripts http://www.gdal.org/ogr2ogr.html
@Kartones Questions? Thanks!
[email protected]