Geospatial CSV Imports Hidden Complexity
Kartones
October 27, 2015
@ Mindcamp 7.0 2015, Madrid
Transcript
@Kartones GEOSPATIAL CSV IMPORTS HIDDEN COMPLEXITY
CartoDB

Agenda
1) CSV Format Issues
2) Import Issues

CSV FORMAT ISSUES
Intro
- .csv / MIME: text/csv
- Unknown birthdate (80s?)
- RFC 4180 (2005)

Intro
- Plain text
- Simple format
- Simple rules

Usage
CSV
0101000020E610000000000000008049C000000000000038C0,1083 "alien",2014-11-04 15:24:40.43413+00 category 1, "jump jump up!", {""value"":""es""}
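The sample above hints at why the "simple rules" stop being simple: a field may itself contain quotes or a whole JSON document. A minimal sketch with Python's standard `csv` module, showing the RFC 4180 convention of wrapping such a field in quotes and doubling the quotes inside it. The row is a hypothetical one assembled from fragments on the slide; how those fragments were laid out in the original file is an assumption.

```python
import csv
import io
import json

# Hypothetical row built from fragments shown on the slide.
row = [
    "category 1",
    "jump jump up!",
    json.dumps({"value": "es"}, separators=(",", ":")),
]

# Writing: the csv module quotes the JSON field and doubles its inner quotes.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
line = buffer.getvalue()

# Reading: the doubled quotes are collapsed again, so the JSON survives intact.
parsed = next(csv.reader(io.StringIO(line)))
payload = json.loads(parsed[2])
```

Splitting the same line on commas by hand would work here only by luck; a JSON value with more than one key contains commas of its own.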
WKT: Well-Known Text
POINT (30 10)
LINESTRING (30 10, 10 30, 40 40)
POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))
MULTIPOINT ((10 40), (40 30), (20 20), (30 10))
MULTIPOLYGON (((30 20, 45 40, 10 40, 30 20)), ((15 5, 40 10, 10 20, 5 10, 15 5)))
https://en.wikipedia.org/wiki/Well-known_text
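To make the WKT notation concrete, here is a minimal sketch of a parser for the two simplest cases, 2D POINT and LINESTRING. The function name `parse_simple_wkt` is hypothetical, and nested types such as POLYGON or MULTIPOLYGON are deliberately rejected; a real import would rely on a geometry library rather than a regular expression.

```python
import re

# Matches only the flat, single-parenthesis geometry types.
_WKT_RE = re.compile(r"^\s*(POINT|LINESTRING)\s*\(\s*(.*?)\s*\)\s*$", re.IGNORECASE)


def parse_simple_wkt(text):
    """Parse a 2D POINT or LINESTRING into (type, [(x, y), ...])."""
    match = _WKT_RE.match(text)
    if match is None:
        raise ValueError(f"unsupported or malformed WKT: {text!r}")
    geom_type = match.group(1).upper()
    coords = []
    for pair in match.group(2).split(","):
        x, y = pair.split()  # raises ValueError if not exactly two numbers
        coords.append((float(x), float(y)))
    if geom_type == "POINT" and len(coords) != 1:
        raise ValueError("a POINT must have exactly one coordinate pair")
    return geom_type, coords
```

Even this small case shows a common import pitfall: WKT writes coordinates as x then y, so longitude comes before latitude.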
WKB: Well-Known Binary
POINT(2.0 4.0) = 000000000140000000000000004010000000000000
https://en.wikipedia.org/wiki/Well-known_text#Well-known_binary
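The hex string is easier to read once it is split into its fields: one byte for byte order, four bytes for the geometry type, then two 8-byte doubles. A minimal sketch of a decoder, assuming only 2D points and assuming the PostGIS EWKB convention of marking an embedded SRID with the 0x20000000 bit of the geometry type. The function name `decode_wkb_point` is hypothetical.

```python
import struct

EWKB_SRID_FLAG = 0x20000000  # PostGIS extension: an SRID follows the type field


def decode_wkb_point(hex_text):
    """Decode a hex-encoded 2D point in WKB or EWKB form into (srid, x, y)."""
    data = bytes.fromhex(hex_text)
    order = "<" if data[0] == 1 else ">"  # 1 = little endian, 0 = big endian
    (geom_type,) = struct.unpack_from(order + "I", data, 1)
    offset = 5
    srid = None
    if geom_type & EWKB_SRID_FLAG:
        (srid,) = struct.unpack_from(order + "I", data, offset)
        offset += 4
    if geom_type & 0xFF != 1:  # 1 is the WKB code for Point
        raise ValueError("only points are handled in this sketch")
    x, y = struct.unpack_from(order + "dd", data, offset)
    return srid, x, y
```

Fed the big-endian example above, this should return no SRID and the coordinates 2.0 and 4.0. The longer hex value at the start of the CSV sample begins with 01, 01000020, E6100000, which this layout reads as little endian, a point with the SRID flag set, and SRID 4326.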
GeoJSON
{
  "type": "Feature",
  "geometry": {
    "type": "Point",
    "coordinates": [125.6, 10.1]
  },
  "properties": {
    "name": "Dinagat Islands"
  }
}
http://geojson.org/
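Since the same point can arrive as GeoJSON or as WKT, an importer ends up converting between them. A minimal sketch, assuming a single Feature with a Point geometry; the function name `geojson_point_to_wkt` is hypothetical and other geometry types are rejected.

```python
import json


def geojson_point_to_wkt(feature_text):
    """Convert a GeoJSON Feature holding a Point into a WKT POINT string."""
    feature = json.loads(feature_text)
    geometry = feature["geometry"]
    if geometry["type"] != "Point":
        raise ValueError("only Point geometries are handled in this sketch")
    # GeoJSON positions are [longitude, latitude], the same x/y order WKT uses.
    lon, lat = geometry["coordinates"][:2]
    return f"POINT ({lon} {lat})"
```

For the Dinagat Islands feature above this yields `POINT (125.6 10.1)`.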
IMPORT ISSUES

Typical
- Huge files (>1GB)
- Lots of rows (+2M)
- Lots of columns (~1600)
- XLS/XLSX -> CSV

Typical
- Stream HTTP downloaded file
- Stream file between servers
- Stream data import to DB
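The streaming points above come down to never holding the whole file in memory. A minimal sketch of that idea for the CSV side, assuming the input is any text file-like object; the function name `iter_batches` and the default batch size are hypothetical, not taken from the talk.

```python
import csv
import io
from itertools import islice


def iter_batches(text_stream, batch_size=1000):
    """Yield lists of CSV rows (as dicts) without loading the whole file."""
    reader = csv.DictReader(text_stream)
    while True:
        # islice pulls at most batch_size rows from the reader on each pass.
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


# Usage with an in-memory stand-in for a large file.
demo = io.StringIO("id,name\n1,a\n2,b\n3,c\n")
batch_sizes = [len(batch) for batch in iter_batches(demo, batch_size=2)]
```

The same generator can be fed from a download in progress, for example by wrapping a binary HTTP response body in `io.TextIOWrapper`, so each batch can go to the database while later rows are still arriving.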
Typical

CartoDB-specific
- Content guessing (e.g. lat/lon)
- Type guessing
- Geometry errors fixing
- Sync tables -> No downtime allowed
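Content and type guessing can be sketched in a few lines, which also shows how fragile it is. This is an illustration under stated assumptions, not CartoDB's actual logic: the function names, the candidate column names, and the three-type vocabulary are all hypothetical.

```python
def guess_column_type(values):
    """Guess 'integer', 'float' or 'text' from a sample of string values."""
    non_empty = [v.strip() for v in values if v and v.strip()]
    if not non_empty:
        return "text"
    for type_name, caster in (("integer", int), ("float", float)):
        try:
            for value in non_empty:
                caster(value)
            return type_name
        except ValueError:
            continue
    return "text"


# Hypothetical lists of header names treated as coordinates.
LAT_NAMES = {"lat", "latitude"}
LON_NAMES = {"lon", "lng", "long", "longitude"}


def find_lat_lon_columns(header):
    """Return the (lat, lon) column names found in a header, or None for each."""
    lat = next((h for h in header if h.strip().lower() in LAT_NAMES), None)
    lon = next((h for h in header if h.strip().lower() in LON_NAMES), None)
    return lat, lon
```

Caveats worth knowing: Python's `float` also accepts strings such as `nan` and `inf`, and a column of postal codes with leading zeros would be guessed as integer and lose them, which is the kind of hidden complexity the slide refers to.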
DB-Specific
- Leave DB indexes as last step
- Prefer big INSERT to multiple UPDATE
- GDAL’s ogr2ogr > Ruby/Python scripts
http://www.gdal.org/ogr2ogr.html
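The first two points can be sketched as a load order: insert everything in one batch, then build the index. The example uses Python's built-in `sqlite3` purely as a stand-in so that it runs anywhere; the talk concerns a different database, and the table, column, and index names here are hypothetical.

```python
import sqlite3


def bulk_load(rows):
    """Insert all rows in one transaction, then create the index last."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE places (name TEXT, lon REAL, lat REAL)")
    with conn:  # a single transaction for the whole batch
        conn.executemany("INSERT INTO places VALUES (?, ?, ?)", rows)
    # Indexing after the load avoids updating the index once per inserted row.
    conn.execute("CREATE INDEX places_name_idx ON places (name)")
    return conn
```

The design choice is the ordering rather than the specific calls: with the index created up front, every insert would also pay for an index update.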
Questions? Thanks!