Slide 1

Slide 1 text

Advanced geoprocessing with… MAGIC 2012 Chad Cooper – [email protected] Center for Advanced Spatial Technologies University of Arkansas, Fayetteville

Slide 2

Slide 2 text

Intros • your name • what you do/where you work • used Python much? – any formal training? – what do you use it for? • know any other languages?

Slide 3

Slide 3 text

Objectives • informal class – expect tangents – code as we go • not geared totally to ArcGIS • THINK – oddball and out of the ordinary applications will make you want more…

Slide 4

Slide 4 text

Outline • data types review • functions • procedural vs. OOP • geometries • rasters • spatial references • error handling/logging • documentation • 3rd party modules • module installation • the web – fetching – scraping – email – FTP • files

Slide 5

Slide 5 text

Strings • ordered collections of characters • immutable – can’t change it • raw strings: • slicing • indexing: • iteration/membership
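A minimal sketch of the string behaviors listed on this slide, written for Python 3. The path and variable names are made up for illustration.

```python
# Illustrative values only; the path is hypothetical.
name = "geoprocessing"

# Raw string: backslashes are kept literally, handy for Windows paths.
path = r"C:\data\new_folder\roads.shp"

# Indexing is zero based; negative indexes count from the end.
first = name[0]
last = name[-1]

# Slicing returns a new string; the original is untouched.
prefix = name[:3]

# Strings are immutable: item assignment raises TypeError.
try:
    name[0] = "G"
    changed = True
except TypeError:
    changed = False

# Membership and iteration.
has_geo = "geo" in name
vowels = [ch for ch in name if ch in "aeiou"]
```

Without the r prefix, the "\n" in the path would be read as a newline, which is the usual reason to reach for raw strings.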

Slide 6

Slide 6 text

Strings • string formatting: • useful string formatting:
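The slide's code did not survive extraction, so here is a small sketch of two common formatting styles, the % operator and str.format(). The layer name and numbers are made up.

```python
layer = "roads"      # hypothetical layer name
count = 1234
length_km = 56.789

# printf-style formatting with the % operator.
msg_percent = "%s has %d features" % (layer, count)

# str.format(): positional fields plus a format spec for two decimals.
msg_format = "{0}: {1:.2f} km".format(layer, length_km)

# Zero padding is useful for building sortable file names.
tile_name = "tile_%03d.tif" % 7
```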

Slide 7

Slide 7 text

Lists • list – ordered collection of arbitrary objects • ordered • mutable – you can change it (extend concatenates lists)

Slide 8

Slide 8 text

Lists… • iterable – very important! • membership • nestable – 2D array/matrix • access by index – zero based
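A short sketch of the list behaviors from the two list slides: mutation, append vs. extend, membership, iteration, nesting, and zero-based indexing. The layer names are made up.

```python
layers = ["roads", "rivers"]

layers.append("parcels")              # adds one item to the end
layers.extend(["wells", "schools"])   # concatenates another list in place
layers[0] = "streets"                 # mutable: replace an item by index

# Membership test.
has_rivers = "rivers" in layers

# Iteration.
upper = []
for layer in layers:
    upper.append(layer.upper())

# Nesting gives a simple 2D array; index row first, then column.
matrix = [[1, 2, 3],
          [4, 5, 6]]
center = matrix[1][1]
```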

Slide 9

Slide 9 text

Dictionaries • unordered collection of arbitrary objects • key/value pairs – think hash/lookup table (keys don’t have to be numbers) • nestable, mutable • access by key, not offset

Slide 10

Slide 10 text

Dictionaries • iterable
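A minimal sketch of dictionary access, mutation, nesting, and iteration. The layer names and counts are made-up values.

```python
# Keys do not have to be numbers; here they are layer names.
feature_counts = {"roads": 120, "rivers": 45}

road_count = feature_counts["roads"]    # access by key, not offset
feature_counts["parcels"] = 3000        # add a new key
feature_counts["rivers"] = 46           # change an existing value

# get() returns a default instead of raising KeyError for a missing key.
well_count = feature_counts.get("wells", 0)

# Nestable: a dictionary of dictionaries.
layers = {"roads": {"type": "Polyline", "count": 120}}
road_type = layers["roads"]["type"]

# Iterating a dictionary yields its keys; sort them for a stable order.
summary = []
for name in sorted(feature_counts):
    summary.append("%s=%d" % (name, feature_counts[name]))
```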

Slide 11

Slide 11 text

Tuples • ordered collection of arbitrary objects • immutable – cannot add or remove items • access by offset • basically an unchangeable list • so what’s the purpose? – FAST – great for iterating over constant set of values – SAFE – you can’t change it
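A quick sketch of tuples in use. The extent values are made up. The last part shows one more payoff of immutability: tuples can serve as dictionary keys.

```python
# A constant set of values, e.g. an extent as (xmin, ymin, xmax, ymax).
extent = (0, 0, 100, 50)

# Access by offset, or unpack everything at once.
xmin, ymin, xmax, ymax = extent
width = xmax - xmin

# Immutable: item assignment raises TypeError.
try:
    extent[0] = 5
    mutated = True
except TypeError:
    mutated = False

# Because they cannot change, tuples work as dictionary keys.
labels = {(0, 0): "origin", (100, 50): "upper right"}
corner_label = labels[(xmax, ymax)]
```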

Slide 12

Slide 12 text

List comprehensions • Map one list to another by applying a function to each of the list elements • Original list goes unchanged
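A small sketch of a list comprehension, first as a plain mapping and then with a filter. The file names are made up.

```python
names = ["roads.shp", "rivers.shp", "readme.txt"]

# Map: apply a function to every element, producing a new list.
upper = [name.upper() for name in names]

# Map plus filter: keep only shapefiles and strip the extension.
shapefiles = [name[:-4] for name in names if name.endswith(".shp")]
```

The original list is left as it was; each comprehension builds a new one.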

Slide 13

Slide 13 text

Sets • unordered collections of objects • like mathematical sets – collection of distinct objects – NO DUPLICATES • example – get rid of dups in a list via list comp

Slide 14

Slide 14 text

Sets • get rid of dups via set: • union:

Slide 15

Slide 15 text

Sets • intersection – data in both sets • symmetric difference – data in one set or the other, but not both • difference – data in first set but not second
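A sketch of the set operations from the three set slides, starting with the two ways to remove duplicates. The values are arbitrary.

```python
ids = [3, 1, 3, 2, 1]

# Duplicates removed with a list comprehension (keeps first-seen order).
unique_listcomp = [x for i, x in enumerate(ids) if x not in ids[:i]]

# Duplicates removed with a set (order is not preserved, so sort it).
unique_set = sorted(set(ids))

a = {1, 2, 3, 4}
b = {3, 4, 5, 6}

union = a | b                   # everything in either set
intersection = a & b            # only what both sets share
symmetric_difference = a ^ b    # in one set or the other, not both
difference = a - b              # in a but not in b
```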

Slide 16

Slide 16 text

Programming paradigms: big blob of code • OK on a small scale for GP scripts • gets out of hand quickly • hard to follow • think ModelBuilder-exported code

Slide 17

Slide 17 text

Programming paradigms: procedural programming • basically a list of instructions • program is built from one or more procedures (functions) – reusable chunks • procedures can be called at any time, anywhere in the program • focus is to break task into collection of variables, data structures, subroutines • natural style, easy to understand • strict separation between code and data

Slide 18

Slide 18 text

Functions • portion of code within a larger program that performs a specific task • can be called anytime, anyplace • can accept arguments • should return a value • keeps code neat • promotes smooth flow
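A minimal sketch of a function that accepts arguments and returns a value. The function name and its default precision are made up for illustration.

```python
def feet_to_meters(feet, precision=2):
    """Return a length in feet converted to meters, rounded to precision."""
    return round(feet * 0.3048, precision)


# Call it anywhere, as many times as needed.
short = feet_to_meters(100)
rough = feet_to_meters(10, precision=1)
```

Returning the value, rather than printing it inside the function, is what lets the caller reuse the result.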

Slide 19

Slide 19 text

Functions

Slide 20

Slide 20 text

Programming paradigms: Procedural example

Slide 21

Slide 21 text

Programming paradigms: Object-oriented programming (OOP) • break program down into data types (classes) that associate behavior (methods) with data (members or attributes) • code becomes more abstract • data and functions for dealing with it are bound together in one object

Slide 22

Slide 22 text

Programming paradigms: Object-oriented programming (OOP)

Slide 23

Slide 23 text

Programming paradigms: Object-oriented programming (OOP) • objects let you wrap complex processes, but present a simple interface to them • methods and attributes are encapsulated inside the object • methods and attributes are exposed to users • you can then update the object without breaking the interface • you can pass objects around - carefully
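A small sketch of a class that binds data (attributes) and behavior (methods) together. Point is a hypothetical class for illustration, not arcpy's Point.

```python
import math


class Point(object):
    """A 2D point that keeps its coordinates and the code that uses them together."""

    def __init__(self, x, y):
        # Attributes: the data carried by each instance.
        self.x = x
        self.y = y

    def distance_to(self, other):
        # Method: behavior that works on the instance's own data.
        return math.hypot(other.x - self.x, other.y - self.y)


origin = Point(0, 0)
corner = Point(3, 4)
distance = origin.distance_to(corner)
```

Callers only need distance_to(); how the distance is computed can change later without breaking them.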

Slide 24

Slide 24 text

Programming paradigms: OOP - Inheritance • classes can inherit attributes and methods • allows you to reuse and customize existing code inside a new class • you can override methods • you can add new methods to a class without modifying the existing class
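A sketch of inheritance with an overridden method and a new method added in the subclass. It is written for Python 3 (note the bare super()), and both class names are made up.

```python
class Feature(object):
    def __init__(self, name):
        self.name = name

    def kind(self):
        return "generic"

    def describe(self):
        return "%s (%s)" % (self.name, self.kind())


class Road(Feature):
    def __init__(self, name, lanes):
        super().__init__(name)   # reuse the parent's setup
        self.lanes = lanes

    def kind(self):
        # Override: replaces Feature.kind for Road instances.
        return "road"

    def is_highway(self):
        # New method: added without touching Feature.
        return self.lanes >= 4


road = Road("I-49", 4)
description = road.describe()    # inherited method, uses the override
```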

Slide 25

Slide 25 text

Programming paradigms: OOP - Inheritance

Slide 26

Slide 26 text

Programming paradigms: OOP - Inheritance

Slide 27

Slide 27 text

Modularizing code • I’m lazy, so I want to reuse code • import statement – call functionality in another module • Have one custom module (a .py file) with code you use all the time • Great way to package up helper functions • ESRI does this with ConversionUtils.py C:\Program Files (x86)\ArcGIS\Server10.0\ArcToolBox\Scripts

Slide 28

Slide 28 text

Geometries • hierarchy: – feature class is made of features – feature is made of parts – part is made of points • hierarchy in Pythonic terms: – part: – multipart polygon: – single part polygon with hole:
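The slide's examples were lost, so the sketch below models the hierarchy with plain lists of (x, y) tuples. No arcpy is needed. It assumes a None entry separates the exterior ring from an interior ring, which mirrors my understanding of how arcpy returns a null point between rings when reading a part. The coordinates and the split_rings helper are made up.

```python
# part: a list of points (the ring closes on its first point)
part = [(0, 0), (0, 10), (10, 10), (10, 0), (0, 0)]

# multipart polygon: a list of parts
multipart = [
    part,
    [(20, 20), (20, 30), (30, 30), (30, 20), (20, 20)],
]

# single part polygon with hole: one part, rings separated by None
with_hole = [
    [(0, 0), (0, 10), (10, 10), (10, 0), (0, 0),
     None,
     (4, 4), (6, 4), (6, 6), (4, 6), (4, 4)],
]


def split_rings(points):
    """Split one part into rings wherever a None separator appears."""
    rings = [[]]
    for point in points:
        if point is None:
            rings.append([])
        else:
            rings[-1].append(point)
    return rings


rings = split_rings(with_hole[0])
```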

Slide 29

Slide 29 text

Reading geometry • accessed through the geometry object of a feature • example: describe_geometry_arcmap.py 1.open up SearchCursor 2.loop through rows 3.get geometry 4.print out X, Y

Slide 30

Slide 30 text

Reading geometry

Slide 31

Slide 31 text

Reading geometry

Slide 32

Slide 32 text

Reading geometry

Slide 33

Slide 33 text

Writing geometry • point features are point objects; lines and polygons are arrays of point objects • Geometry objects can be created using the Geometry, Multipoint, PointGeometry, Polygon, or Polyline classes

Slide 34

Slide 34 text

Writing geometry

Slide 35

Slide 35 text

Writing geometry

Slide 36

Slide 36 text

Rasters • Raster class – raster object: variable that references a raster dataset – gives access to raster props • raster calculations – Map Algebra – can cast to Raster object for calculations

Slide 37

Slide 37 text

Rasters

Slide 38

Slide 38 text

Spatial references • can get properties from • SpatialReference class • methods to create/edit spatial refs

Slide 39

Slide 39 text

Spatial references • SpatialReference class • methods to create/edit spatial refs

Slide 41

Slide 41 text

Exception Handling • It’s necessary, stuff fails • Useful error reporting • Proper application cleanup • Combine it with logging

Slide 42

Slide 42 text

Exception handling – try/except • most basic form of error handling • wrap whole program or portions of code • use optional finally clause for cleanup – close open files – close database connections – check extensions back in
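A minimal sketch of try/except with a finally clause doing the cleanup. The helper name, the file names, and the log list are made up; the list just records what happened so the order is visible.

```python
import os
import tempfile


def read_first_line(path, log):
    """Return the first line of a file, or None if it cannot be opened."""
    handle = None
    try:
        handle = open(path)
        return handle.readline().strip()
    except IOError:
        # IOError is an alias of OSError in Python 3.
        log.append("failed: %s" % os.path.basename(path))
        return None
    finally:
        # Runs whether or not an exception occurred.
        if handle is not None:
            handle.close()
        log.append("cleanup done")


log = []
tmp_dir = tempfile.mkdtemp()
good_path = os.path.join(tmp_dir, "notes.txt")
with open(good_path, "w") as f:
    f.write("first line\nsecond line\n")

good = read_first_line(good_path, log)
bad = read_first_line(os.path.join(tmp_dir, "missing.txt"), log)
```

The same finally slot is where a GP script would close database connections or check extensions back in.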

Slide 43

Slide 43 text

Exception handling

Slide 44

Slide 44 text

Exception handling

Slide 45

Slide 45 text

Exception handling - raise • allows you to force an exception to occur • can be used to alert of conditions
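A short sketch of raise used to flag a bad input before any work starts. The function name and the rule it enforces are made up.

```python
def check_cell_size(cell_size):
    """Return cell_size unchanged, or raise if it cannot be used."""
    if cell_size <= 0:
        # Force an exception to alert the caller to the bad condition.
        raise ValueError("cell size must be positive, got %s" % cell_size)
    return cell_size


try:
    check_cell_size(-30)
    message = "ok"
except ValueError as err:
    message = str(err)
```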

Slide 46

Slide 46 text

Exception handling - raise

Slide 47

Slide 47 text

Exception handling – AddError and traceback • – returns GP-specific errors • traceback – prints stack trace; determines precise location of error – good for larger, more complex programs
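A sketch of the traceback half only, using the standard library. The arcpy side (passing the message on with arcpy.AddError) is left out because it needs an ArcGIS install. The divide function is a made-up stand-in for a failing tool call.

```python
import sys
import traceback


def divide(a, b):
    return a / b


try:
    divide(1, 0)
    report = ""
    failing_function = None
except ZeroDivisionError:
    # Full stack trace as a string, ready to print or log.
    report = traceback.format_exc()

    # The last traceback entry is where the error was actually raised.
    tb = sys.exc_info()[2]
    last_entry = traceback.extract_tb(tb)[-1]
    failing_function = last_entry[2]   # (filename, line number, function, text)
```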

Slide 48

Slide 48 text

Exception handling – AddError and traceback

Slide 49

Slide 49 text

Logging • logging module • logging levels: – DEBUG: detailed; for troubleshooting – INFO: normal operation, statuses – WARNING: still working, but unexpected behavior – ERROR: more serious, some function not working – CRITICAL: program cannot continue
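A small sketch of how the level threshold filters messages. It logs to an in-memory stream so the result can be inspected. The logger name and the messages are made up.

```python
import io
import logging

stream = io.StringIO()

logger = logging.getLogger("levels_demo")
logger.setLevel(logging.WARNING)      # anything below WARNING is dropped
logger.propagate = False              # keep messages out of the root logger

handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
logger.addHandler(handler)

logger.debug("cursor opened")         # dropped
logger.info("processing layer 3")     # dropped
logger.warning("field is empty")
logger.error("could not project layer")
logger.critical("workspace not found")

lines = stream.getvalue().splitlines()
```

Lowering the threshold to logging.DEBUG during troubleshooting brings the first two messages back without touching the calls.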

Slide 50

Slide 50 text

Super-basic logging

Slide 51

Slide 51 text

Super-basic logging to a log file

Slide 52

Slide 52 text

Super-basic logging to a log file

Slide 53

Slide 53 text

Meaningful logging • “customize” the logger • add in info-level message(s) to get logged • log our errors to log file • can get much more advanced, see the docs
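A sketch of a customized logger that writes info messages and errors to a log file. The log file name, the logger name, and the failing int() call (a stand-in for a GP tool that raises) are all made up.

```python
import logging
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "gp_script.log")

logger = logging.getLogger("gp_script")
logger.setLevel(logging.INFO)

handler = logging.FileHandler(log_path)
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)

logger.info("script started")
try:
    int("not a number")
except ValueError:
    # exception() logs at ERROR level and appends the traceback.
    logger.exception("conversion failed")
logger.info("script finished")

# Detach and close the handler so the file is flushed and released.
logger.removeHandler(handler)
handler.close()

with open(log_path) as f:
    log_text = f.read()
```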

Slide 54

Slide 54 text

Meaningful logging

Slide 55

Slide 55 text

Meaningful logging

Slide 56

Slide 56 text

Code documentation • Pythonic standards covered in PEPs 8 and 257 • help() • comments need to be worth it • name items well • be precise and compact • comments may be for you
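A minimal sketch of a docstring laid out the way PEP 257 suggests: a one-line summary, a blank line, then details. help() reads this same text. The function itself is made up.

```python
import inspect


def acres_to_hectares(acres):
    """Convert an area in acres to hectares.

    The argument is a number of acres; the return value is the same
    area expressed in hectares.
    """
    return acres * 0.40468564224


# help(acres_to_hectares) would print the docstring; getdoc() returns it
# as a string with the indentation cleaned up.
doc = inspect.getdoc(acres_to_hectares)
summary = doc.splitlines()[0]
```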

Slide 57

Slide 57 text

Creating documentation • pydoc – built-in; used by help() – generate HTML on any module – kinda plain • epydoc – old, rumored to be dead – produces nicely formatted HTML – easy to install and use • Sphinx framework – “intelligent and beautiful documentation” – all the cool kids are using it (docs.python.org) – more involved to set up and use

Slide 58

Slide 58 text

Branching out

Slide 59

Slide 59 text

Installing packages

Slide 60

Slide 60 text

Installing packages (on Windows) • Windows executables • Python eggs – .zip file with metadata, renamed .egg – distributes code as a bundle – need easy_install • pip – tool for installing and managing Python packages – replacement for easy_install

Slide 61

Slide 61 text

pip • can take care of dependencies for you • uninstallation! • install via easy_install, ironically

Slide 62

Slide 62 text

virtualenv • a tool to create isolated Python environments • manage dependencies on a per-project basis, rather than globally installing • test modules without installing into site-packages • avoid unintentional upgrades

Slide 63

Slide 63 text

virtualenv • install via pip, easy_install, or by • create the env • activate the env • use the env

Slide 64

Slide 64 text

virtualenv • installs Python where you tell it, modifies system path to point there – good only while the env is activated • use yolk to list installed packages in env • But can this work in ArcMap Python prompt?

Slide 65

Slide 65 text

virtualenv • YES, with a little work... • tells ArcMap to use Python interpreter in our virtualenv – kill ArcMap, back to using default interpreter

Slide 67

Slide 67 text

The web • Infinite source of information • Right-click and “Save as” is so lame (and too much work) • Python can help you exploit the web – ftplib, http (urllib), mechanize, scraping (Beautiful Soup), send email (smtplib)

Slide 68

Slide 68 text

Fetching data • Built-in libraries for ftp and http • ftplib – log in, nav to directory, retrieve files • urllib/urllib2 – pass in the url you want, get it back • wget – GNU commandline tool – Can call with os.system()

Slide 69

Slide 69 text

Fetching data

Slide 70

Slide 70 text

Scraping • Scrape data from a web page • Well-structured content is a HUGE help, as is valid markup, which isn’t always there • BeautifulSoup 3rd party module – Built in methods and regex’s help out – Great for getting at tables of data

Slide 71

Slide 71 text

Scraping addresses http://www.phillypal.com/pal_locations.php

Slide 72

Slide 72 text

Scraping addresses

Slide 73

Slide 73 text

Scraping addresses

Slide 74

Slide 74 text

Emailing • smtp built-in library • best if you have IP of your email server • port blocking can be an issue • there’s always Gmail too…
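A sketch of building and sending a message with the standard library, written for Python 3. The addresses, subject, and function names are made up, and send_message is defined but never called here because it needs a reachable mail server.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Assemble a plain-text email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_message(msg, host, port=25):
    """Send msg through the SMTP server at host:port (not called here)."""
    with smtplib.SMTP(host, port) as server:
        server.send_message(msg)


msg = build_message(
    "gis@example.com",
    "boss@example.com",
    "Nightly script finished",
    "All 12 layers updated.",
)
```

A call like send_message(msg, "10.0.0.5") would do the sending; that IP is a placeholder for your own mail server.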

Slide 75

Slide 75 text

Files • built in open function – slurp entire file into memory – OK except for huge files • iterate over the lines • CSV module
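A sketch of the three approaches on this slide, written for Python 3: reading a whole file at once, iterating over its lines, and parsing it with the csv module. The file name and its contents are made up.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "wells.csv")

# Write a small CSV file to work with.
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "x", "y"])
    writer.writerow([1, 10.5, 20.25])
    writer.writerow([2, 11.0, 21.75])

# Slurp the entire file into memory: fine unless the file is huge.
with open(path) as f:
    whole = f.read()

# Iterate over the lines: only one line is held in memory at a time.
line_count = 0
with open(path) as f:
    for line in f:
        line_count += 1

# csv module: each row becomes a dictionary keyed by the header row.
with open(path, newline="") as f:
    rows = list(csv.DictReader(f))
```

Values come back from the csv module as strings, so convert with float() or int() before doing math on them.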

Slide 77

Slide 77 text

Excel • love, hate, love • many modules out there – xlrd (read) / xlwt (write) – only .xls – openPyXL – read/write .xlsx • uses – Push text data to Excel file – Push featureclass data to Excel programmatically – Read someone else’s “database”

Slide 78

Slide 78 text

Reading Excel

Slide 79

Slide 79 text

Writing Excel

Slide 80

Slide 80 text

Writing Excel

Slide 81

Slide 81 text

Databases • You can connect to pretty much ANY database • Is there one true solution?? • pyodbc – Access, SQL Server, MySQL • Oracle – cx_Oracle • Others – pymssql, _mssql, MySQLdb • Execute SQL statements through a connection
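A sketch of executing SQL through a connection. It uses the standard library's sqlite3 as a stand-in so it runs anywhere; pyodbc, cx_Oracle, and the others follow the same connect / cursor / execute pattern, though connection strings and parameter placeholders vary by driver. The table and values are made up.

```python
import sqlite3

# An in-memory database, so nothing is left on disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute(
    "CREATE TABLE wells (id INTEGER PRIMARY KEY, name TEXT, depth_ft REAL)")
cur.executemany(
    "INSERT INTO wells (name, depth_ft) VALUES (?, ?)",
    [("A-1", 120.0), ("B-7", 310.5)],
)
conn.commit()

# Pass values as parameters rather than pasting them into the SQL string.
cur.execute(
    "SELECT name FROM wells WHERE depth_ft > ? ORDER BY name", (200,))
deep_wells = [row[0] for row in cur.fetchall()]

conn.close()
```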

Slide 82

Slide 82 text

Resources - FREE • Dive into Python • Python Cookbook • Think Python • Python docs • gis.stackexchange.com • Google is your friend (as always) • Python community is HUGE and GIVING

Slide 83

Slide 83 text

Conferences • pyArkansas – annually in Conway – pyar2 list on python.org • PyCon – THE national US Python conference • FOSS4G – international open source for GIS • ESRI Developer Summit – major dork-fest, but great learning opportunity and Palm Springs in March

Slide 84

Slide 84 text

IDEs and editors • Wing – different license levels, good people • PyScripter – open source, code completion • Komodo – free version also available • Notepad2 – ol’ standby editor • Notepad++ – people swear by it • PythonWin – another standby, but barebones • …dozens (at least) more editors out there…

Slide 85

Slide 85 text

More reading • http://www.voidspace.org.uk/python/articles/OOP.shtml - great OOP article (which I used a lot)
