Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Geospatial CSV Imports Hidden Complexity
Search
Sponsored
·
Ship Features Fearlessly
Turn features on and off without deploys. Used by thousands of Ruby developers.
→
Kartones
October 27, 2015
Programming
0
55
Geospatial CSV Imports Hidden Complexity
@ Mindcamp 7.0 2015, Madrid
Kartones
October 27, 2015
Tweet
Share
More Decks by Kartones
See All by Kartones
Building Autonomous Agents with gym-retro
kartones
0
45
Python static typing with MyPy
kartones
0
71
High-impact refactors keeping the lights on
kartones
0
67
Remote Work
kartones
0
94
Intro to GameBoy Development
kartones
0
100
Myths & The Real World of OpenSource Development
kartones
0
49
CartoDB Tech Intro
kartones
0
49
Copy Protection & Cracking History
kartones
0
130
Cómo ganar dinero con tus juegos online
kartones
1
120
Other Decks in Programming
See All in Programming
なるべく楽してバックエンドに型をつけたい!(楽とは言ってない)
hibiki_cube
0
130
AIで開発はどれくらい加速したのか?AIエージェントによるコード生成を、現場の評価と研究開発の評価の両面からdeep diveしてみる
daisuketakeda
1
930
CSC307 Lecture 01
javiergs
PRO
0
680
HTTPプロトコル正しく理解していますか? 〜かわいい猫と共に学ぼう。ฅ^•ω•^ฅ ニャ〜
hekuchan
2
670
Architectural Extensions
denyspoltorak
0
250
Vibe Coding - AI 驅動的軟體開發
mickyp100
0
160
GISエンジニアから見たLINKSデータ
nokonoko1203
0
200
AtCoder Conference 2025
shindannin
0
1k
16年目のピクシブ百科事典を支える最新の技術基盤 / The Modern Tech Stack Powering Pixiv Encyclopedia in its 16th Year
ahuglajbclajep
5
950
AI Agent Tool のためのバックエンドアーキテクチャを考える #encraft
izumin5210
6
1.7k
Kotlin Multiplatform Meetup - Compose Multiplatform 외부 의존성 아키텍처 설계부터 운영까지
wisemuji
0
180
ゆくKotlin くるRust
exoego
1
220
Featured
See All Featured
Introduction to Domain-Driven Design and Collaborative software design
baasie
1
570
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
254
22k
The Impact of AI in SEO - AI Overviews June 2024 Edition
aleyda
5
720
ReactJS: Keep Simple. Everything can be a component!
pedronauck
666
130k
Designing Experiences People Love
moore
144
24k
KATA
mclloyd
PRO
34
15k
B2B Lead Gen: Tactics, Traps & Triumph
marketingsoph
0
48
Ecommerce SEO: The Keys for Success Now & Beyond - #SERPConf2024
aleyda
1
1.8k
Learning to Love Humans: Emotional Interface Design
aarron
275
41k
The #1 spot is gone: here's how to win anyway
tamaranovitovic
2
920
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
49
3.3k
4 Signs Your Business is Dying
shpigford
187
22k
Transcript
@Kartones GEOSPATIAL CSV IMPORTS HIDDEN COMPLEXITY
@Kartones CartoDB
@Kartones Agenda 1) CSV Format Issues 2) Import Issues
@Kartones CSV FORMAT ISSUES
@Kartones Intro .csv / MIME:text/csv Unknown birthdate (80s?) RFC 4180
(2005)
@Kartones Intro Plain text Simple format Simple rules
@Kartones Usage
@Kartones CSV 0101000020E610000000000000008049C000000000000038C0,1083 "alien",2014-11-04 15:24:40.43413+00 category 1, "jump jump up!",
{""value"":""es""}
@Kartones WKT: Well-Known Text POINT (30 10) LINESTRING (30 10,
10 30, 40 40) POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10)) MULTIPOINT ((10 40), (40 30), (20 20), (30 10)) MULTIPOLYGON (((30 20, 45 40, 10 40, 30 20)), ((15 5, 40 10, 10 20, 5 10, 15 5))) https://en.wikipedia.org/wiki/Well-known_text
@Kartones WKB: Well-Known Binary POINT(2.0 4.0) = 000000000140000000000000004010000000000000 https://en.wikipedia.org/wiki/Well-known_text#Well-known_binary
@Kartones GeoJSON { "type": "Feature", "geometry": { "type": "Point", "coordinates":
[125.6, 10.1] }, "properties": { "name": "Dinagat Islands" } } http://geojson.org/
@Kartones IMPORT ISSUES
@Kartones Typical Huge files (>1GB) Lots of rows (+2M) Lots
of columns (~1600) XLS/XLSX -> CSV
@Kartones Typical Stream HTTP downloaded file Stream file between servers
Stream data import to DB
@Kartones Typical
@Kartones CartoDB-specific Content guessing (e.g. lat/lon) Type guessing Geometry errors
fixing Sync tables -> No downtime allowed
@Kartones DB-Specific Leave DB indexes as last step Prefer big
INSERT to multiple UPDATE GDAL’s ogr2ogr > Ruby/Python scripts http://www.gdal.org/ogr2ogr.html
@Kartones Questions? Thanks!
[email protected]