Geospatial CSV Imports Hidden Complexity
Kartones
October 27, 2015
Programming
@ Mindcamp 7.0 2015, Madrid
Transcript
@Kartones GEOSPATIAL CSV IMPORTS HIDDEN COMPLEXITY
CartoDB

Agenda
1) CSV Format Issues
2) Import Issues

CSV FORMAT ISSUES

Intro
.csv / MIME: text/csv
Unknown birthdate (80s?)
RFC 4180 (2005)

Intro
Plain text
Simple format
Simple rules

Usage

CSV
0101000020E610000000000000008049C000000000000038C0,1083 "alien",2014-11-04 15:24:40.43413+00 category 1, "jump jump up!", {""value"":""es""}
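
The quoting on the slide above (a field wrapped in double quotes, with inner quotes escaped by doubling them) is the part of RFC 4180 that naive `split(",")` parsing gets wrong. A minimal sketch using Python's standard `csv` module follows. The sample row and its column names are hypothetical, written for illustration, and are not the exact rows from the slide.

```python
import csv
import io
import json

# Hypothetical sample (not taken from the slides): a quoted field containing a
# comma, and a JSON field whose inner quotes are escaped by doubling them.
SAMPLE = (
    "id,name,comment,meta\r\n"
    '1083,"alien","jump, jump up!","{""value"":""es""}"\r\n'
)


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    # newline="" leaves line endings untouched so the csv module handles them.
    return list(csv.DictReader(io.StringIO(text, newline="")))


rows = read_rows(SAMPLE)
meta = json.loads(rows[0]["meta"])  # the unescaped field is valid JSON
print(rows[0]["comment"])
print(meta["value"])
```

The comma inside the quoted comment stays in one field, and the doubled quotes collapse back to single quotes before the JSON is parsed.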

WKT: Well-Known Text
POINT (30 10)
LINESTRING (30 10, 10 30, 40 40)
POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))
MULTIPOINT ((10 40), (40 30), (20 20), (30 10))
MULTIPOLYGON (((30 20, 45 40, 10 40, 30 20)), ((15 5, 40 10, 10 20, 5 10, 15 5)))
https://en.wikipedia.org/wiki/Well-known_text
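
To give a feel for what an importer has to do with a WKT column, here is a minimal sketch that parses only the two simplest types from the slide, POINT and LINESTRING. The function name `parse_simple_wkt` is made up for this example. A real import would rely on a geometry library or the database itself, and would also cover polygons, multi-geometries, and Z/M coordinates.

```python
import re

# Matches "TYPE (body)" for the two simple geometry types handled here.
_WKT_RE = re.compile(r"^\s*(POINT|LINESTRING)\s*\((.*)\)\s*$", re.IGNORECASE)


def parse_simple_wkt(wkt):
    """Return (geometry_type, [(x, y), ...]) for POINT or LINESTRING WKT."""
    match = _WKT_RE.match(wkt)
    if match is None:
        raise ValueError(f"unsupported or malformed WKT: {wkt!r}")
    geom_type = match.group(1).upper()
    coords = []
    for pair in match.group(2).split(","):
        x, y = pair.split()  # ValueError unless there are exactly two tokens
        coords.append((float(x), float(y)))
    if geom_type == "POINT" and len(coords) != 1:
        raise ValueError("a POINT must have exactly one coordinate pair")
    return geom_type, coords


print(parse_simple_wkt("POINT (30 10)"))
print(parse_simple_wkt("LINESTRING (30 10, 10 30, 40 40)"))
```

Anything the sketch does not understand, such as the POLYGON example, raises `ValueError` rather than being silently imported as garbage.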

WKB: Well-Known Binary
POINT(2.0 4.0) = 000000000140000000000000004010000000000000
https://en.wikipedia.org/wiki/Well-known_text#Well-known_binary
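
The hex string above has a fixed layout: one byte for byte order, a 4-byte geometry type, then two 8-byte doubles. The sketch below decodes a 2D point with Python's `struct` module. It also accepts the PostGIS EWKB variant, where a flag in the type field signals that a 4-byte SRID follows. That the first field on the CSV slide is EWKB is an assumption based on its shape. `decode_point` is a name invented for this example.

```python
import struct

SRID_FLAG = 0x20000000  # PostGIS EWKB extension: an SRID follows the type
POINT_TYPE = 1


def decode_point(hex_wkb):
    """Decode a 2D point from WKB or EWKB hex into (srid, x, y)."""
    data = bytes.fromhex(hex_wkb)
    # The first byte is the byte order: 0 = big-endian, 1 = little-endian.
    endian = "<" if data[0] == 1 else ">"
    (geom_type,) = struct.unpack_from(endian + "I", data, 1)
    offset = 5
    srid = None
    if geom_type & SRID_FLAG:
        (srid,) = struct.unpack_from(endian + "I", data, offset)
        offset += 4
    if (geom_type & ~SRID_FLAG) != POINT_TYPE:
        raise ValueError("only 2D points are handled in this sketch")
    x, y = struct.unpack_from(endian + "2d", data, offset)
    return srid, x, y


# The string from the slide, which the slide labels as POINT(2.0 4.0).
print(decode_point("000000000140000000000000004010000000000000"))
```

Two byte orders plus vendor extensions in a supposedly plain-text column is a fair example of the hidden complexity the talk title refers to.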

GeoJSON
{
  "type": "Feature",
  "geometry": { "type": "Point", "coordinates": [125.6, 10.1] },
  "properties": { "name": "Dinagat Islands" }
}
http://geojson.org/
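
A GeoJSON value inside a CSV cell can be read with any JSON parser. The one thing to watch is coordinate order: GeoJSON positions are [longitude, latitude], the reverse of the "lat, lon" order people usually type. A minimal sketch, using the Feature from the slide and a hypothetical helper name, `point_feature_to_wkt`:

```python
import json

FEATURE = """
{ "type": "Feature",
  "geometry": { "type": "Point", "coordinates": [125.6, 10.1] },
  "properties": { "name": "Dinagat Islands" } }
"""


def point_feature_to_wkt(feature_json):
    """Convert a GeoJSON Point Feature into (name, WKT string)."""
    feature = json.loads(feature_json)
    geometry = feature["geometry"]
    if geometry["type"] != "Point":
        raise ValueError("only Point geometries are handled in this sketch")
    # GeoJSON stores positions as [longitude, latitude], in that order.
    lon, lat = geometry["coordinates"][:2]
    # "properties" may legally be null, so fall back to an empty dict.
    name = (feature.get("properties") or {}).get("name")
    return name, f"POINT ({lon} {lat})"


print(point_feature_to_wkt(FEATURE))
# → ('Dinagat Islands', 'POINT (125.6 10.1)')
```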

IMPORT ISSUES

Typical
Huge files (>1GB)
Lots of rows (2M+)
Lots of columns (~1600)
XLS/XLSX -> CSV

Typical
Stream HTTP downloaded file
Stream file between servers
Stream data import to DB
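
The common thread in the streaming points above is never holding the whole file in memory. One way to sketch that in Python is a generator that reads from any iterable of text lines (an open file, or a response body wrapped in a text reader) and hands out fixed-size batches. `iter_batches` and the batch size are illustrative choices, not something taken from the talk.

```python
import csv
import io
from itertools import islice


def iter_batches(lines, batch_size=1000):
    """Yield (header, rows) batches without loading the whole input."""
    reader = csv.reader(lines)
    header = next(reader, None)
    if header is None:  # empty input: nothing to yield
        return
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            break
        yield header, batch


# Stand-in for a large file or HTTP response body opened in text mode.
source = io.StringIO("lat,lon\n40.4,-3.7\n41.4,2.2\n37.4,-6.0\n")
sizes = [len(rows) for _, rows in iter_batches(source, batch_size=2)]
print(sizes)
# → [2, 1]
```

Each batch can then be passed on to the next stage (another server, or a database insert) before the following one is read.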

Typical

CartoDB-specific
Content guessing (e.g. lat/lon)
Type guessing
Fixing geometry errors
Sync tables -> No downtime allowed
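
As a rough illustration of what content guessing can involve, the sketch below looks for latitude and longitude columns by combining a column-name check with a value-range check on a sample of rows. This is a hypothetical heuristic written for this transcript. The name lists, the function names, and the rules are assumptions and do not describe how CartoDB does it.

```python
# Hypothetical name lists; a real importer would likely use many more.
LAT_NAMES = {"lat", "latitude", "y"}
LON_NAMES = {"lon", "lng", "long", "longitude", "x"}


def _all_in_range(values, low, high):
    """True if there is at least one value and all are floats in [low, high]."""
    try:
        numbers = [float(v) for v in values]
    except ValueError:
        return False
    return bool(numbers) and all(low <= n <= high for n in numbers)


def guess_lat_lon(header, sample_rows):
    """Return (lat_column, lon_column); each is None if no column qualifies."""
    columns = {
        name: [row[i] for row in sample_rows] for i, name in enumerate(header)
    }
    lat = lon = None
    for name, values in columns.items():
        key = name.strip().lower()
        if lat is None and key in LAT_NAMES and _all_in_range(values, -90, 90):
            lat = name
        elif lon is None and key in LON_NAMES and _all_in_range(values, -180, 180):
            lon = name
    return lat, lon


header = ["name", "Latitude", "lon"]
sample = [["Madrid", "40.4", "-3.7"], ["Barcelona", "41.4", "2.2"]]
print(guess_lat_lon(header, sample))
# → ('Latitude', 'lon')
```

The range check matters because a column called "x" or "lat" is not always a coordinate, and a wrong guess produces geometry errors later in the import.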

DB-Specific
Leave DB index creation as the last step
Prefer one big INSERT to multiple UPDATEs
GDAL's ogr2ogr > Ruby/Python scripts
http://www.gdal.org/ogr2ogr.html
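
The first two points can be sketched with Python's built-in `sqlite3` module, used here only as a stand-in so the example runs anywhere. The talk concerns a PostgreSQL-backed service, and the table and index names below are made up. The rows go in as one batched INSERT inside a single transaction, and the index is created only after the data is loaded.

```python
import sqlite3


def bulk_load(rows):
    """Load (name, lat, lon) rows in one transaction, then build the index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE places (name TEXT, lat REAL, lon REAL)")
    with conn:  # one transaction for the whole batch
        conn.executemany("INSERT INTO places VALUES (?, ?, ?)", rows)
    # The index comes last so the inserts do not pay for index maintenance.
    conn.execute("CREATE INDEX idx_places_lat_lon ON places (lat, lon)")
    return conn


conn = bulk_load([("Madrid", 40.4, -3.7), ("Barcelona", 41.4, 2.2)])
count = conn.execute("SELECT COUNT(*) FROM places").fetchone()[0]
print(count)
# → 2
```

Building the index once over the finished table is generally cheaper than updating it on every inserted row, which is the reasoning behind leaving it for the end.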

Questions? Thanks!
[email protected]