September 01, 2011

# Python for GIS...and then some

Intermediate Python for GIS course taught at the 2011 Arkansas GIS Users Forum meeting, Bentonville, AR

## Transcript

1. ### Python for GIS…and then some

2011 AR GIS User’s Forum Conference
Chad Cooper
Center for Advanced Spatial Technologies
University of Arkansas, Fayetteville
2. ### Intros

• Name
• What you do/where you work
• Used Python much?
  – Any formal training?
  – What do you use it for?
• Know any other languages?
3. ### Objectives

• Informal class
  – Ask questions, you will stump me, but we will find an answer
  – Expect tangents
• NOT geared totally to ArcGIS
• Let’s cover some important basics
• Python for accomplishing other tasks
• THINK – oddball and out of the ordinary applications will make you want more…
4. ### (VERY Rough) Outline

• Strings/operations
• Lists, dictionaries, tuples, sets
• File input/output
• The web
  – Fetching
  – Scraping
  – Email
  – FTP
• Regular expressions
• Logging
• Excel
• Exception handling
• ArcGIS
• Databases
• Resources
• SWAG!
5. ### Strings

• Ordered collections of characters
• Immutable
• Raw strings (a raw string cannot end in a single backslash):
    path = r"C:\temp\chad"
• Indexing:
    fruit[0] >> 'b'
• Slicing:
    fruit[1:3] >> 'an'
• Iteration/membership:
    for each in fruit
    'f' in fruit
• String formatting:
    'a %s parrot' % 'dead' >> 'a dead parrot'
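The string operations on this slide can be tried as a short script. This is a minimal sketch; the value `fruit = "banana"` is an assumption (the slide never defines `fruit`, but `'b'` and `'an'` are consistent with it).

```python
# Assumed value: the slide does not define fruit; "banana" matches its outputs.
fruit = "banana"

# Raw string: backslashes are kept literally, so Windows paths are easy to type.
path = r"C:\temp\chad"

first = fruit[0]        # indexing: one character by position
middle = fruit[1:3]     # slicing: characters at positions 1 and 2
has_f = "f" in fruit    # membership test
letters = [ch for ch in fruit]  # iteration: one character at a time

# Old-style string formatting, as shown on the slide.
sentence = "a %s parrot" % "dead"

print(first, middle, has_f, sentence)
```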
6. ### Lists

• List – ordered collection of arbitrary objects
    list1 = [0,1,2,3]
    list2 = ['zero','one','two','three']
    list3 = [0,'zero',1,'one',2,'two',3,'three']
• Ordered
    list2.sort()                   >> ['one','three',...]
    list2.sort(reverse=True)       >> ['zero','two',...]
• Mutable – you can change it
    list1.append(4)                >> [0,1,2,3,4]
    list1.reverse()                >> [4,3,2,1,0]
    list2.insert(0,'one-half')     >> ['one-half','zero'…]
    list2.extend(['four','five'])  <- Extend concats lists
7. ### Lists…

• Iterable – very important!
    for l in list3
    0 zero ...
• Membership
    3 in list3 --> True
• Nestable – 2D array/matrix
    list4 = [[0,1,2], [3,4,5], [6,7,8]]
• Access by index – zero based
    list4[1]     >> [3,4,5]
    list4[1][2]  >> 5
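The list operations from the two Lists slides, run in sequence, look roughly like this. It is only a sketch: each call mutates the list in place, so the results depend on the order of the calls and will not match every slide column exactly.

```python
list1 = [0, 1, 2, 3]
list2 = ["zero", "one", "two", "three"]
list3 = [0, "zero", 1, "one", 2, "two", 3, "three"]

# Lists are mutable: these calls change the list in place and return None.
list1.append(4)
list1.reverse()
list2.sort()                      # alphabetical order
list2.insert(0, "one-half")       # insert at the front
list2.extend(["four", "five"])    # concatenate another list onto the end

# Iteration and membership.
for item in list3:
    print(item)
found = 3 in list3

# Nested lists work as a simple 2D matrix; indexes are zero based.
list4 = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
row = list4[1]
cell = list4[1][2]
```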
8. ### Dictionaries

• Unordered collection of arbitrary objects
    d = {1:'foo', 2:'bar'}
• Key/value pairs – think hash/lookup table (keys don’t have to be numbers)
    d.keys()    >> [1, 2]
    d.values()  >> ['foo','bar']
• Nestable, mutable
    d[3] = 'spam'
    del d[key]
• Access by key, not offset
    d[2] >> 'bar'
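A quick sketch of the dictionary operations above. The string key `"city"` and the fallback text are made up for illustration. Note that on Python 3.7 and later dictionaries remember insertion order, and `keys()`/`values()` return views rather than lists, so the sketch wraps them in `list()`.

```python
d = {1: "foo", 2: "bar"}

# Mutable: add and remove key/value pairs in place.
d[3] = "spam"
d["city"] = "Bentonville"   # hypothetical entry: keys don't have to be numbers
del d[1]

# Access by key, not by position.
value = d[2]

# .get() returns a fallback instead of raising KeyError for a missing key.
missing = d.get(99, "not found")

keys = list(d.keys())
values = list(d.values())

# Iterate over key/value pairs together.
for key, val in d.items():
    print(key, val)
```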
9. ### Tuples

• Ordered collection of arbitrary objects
• Immutable – cannot add, remove, or reassign items
• Access by offset
• Basically an unchangeable list
    (1,2,'three',4,…)
• So what’s the purpose?
  – FAST – great for iterating over constant set of values
  – SAFE – you can’t change it
10. ### Sets

• Unordered collections of objects
• Like mathematical sets – collection of distinct objects – NO DUPLICATES
• Example – get rid of dups in a list
    L1=[2,2,3,4,5,5,3]
    L2=[]
    [L2.append(x) for x in L1 if x not in L2]
    >>> L2
    [2, 3, 4, 5]
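The slide's example removes duplicates with a list and a membership test. The same job can be sketched with an actual `set`, which is the usual idiom; this is an alternative to the slide's code, not a transcription of it.

```python
L1 = [2, 2, 3, 4, 5, 5, 3]

# A set keeps only distinct values, but it has no guaranteed order,
# so sort the result if order matters.
L2 = sorted(set(L1))

# To keep first-seen order, track what has been seen in a set.
seen = set()
ordered = []
for x in L1:
    if x not in seen:
        seen.add(x)
        ordered.append(x)

# Sets also support the mathematical operations.
common = {1, 2, 3} & {2, 3, 4}   # intersection
combined = {1, 2, 3} | {2, 3, 4}  # union
```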
11. ### List comprehensions

• Map one list to another by applying a function to each of the list elements
• Original list goes unchanged
    L = [2,4,6,8]
    J = [elem * 2 for elem in L]
    >>> J
    [4, 8, 12, 16]
12. ### Files

• Built in open function – slurp entire file into memory – OK except for huge files
    data = open(file).read().splitlines()
• Iterate over the lines
    for line in data:
        do something
• CSV module
    reader = csv.reader(open('C:/file.csv','rb'))
    for line in reader:
        do something
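A runnable sketch of both approaches follows. It is written for Python 3, where the csv module wants files opened in text mode with `newline=""` rather than the `'rb'` mode used on the slide. The file name and its contents are made up so the example is self-contained.

```python
import csv
import os
import tempfile

# Create a small sample file (hypothetical name and contents).
csv_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(csv_path, "w", newline="") as f:
    csv.writer(f).writerows([["id", "name"], ["1", "Benton"], ["2", "Washington"]])

# Slurp the whole file into memory as a list of lines.
with open(csv_path) as f:
    lines = f.read().splitlines()

# Let the csv module split each line into fields.
with open(csv_path, newline="") as f:
    rows = [row for row in csv.reader(f)]

for row in rows:
    print(row)
```

Using `with` closes the file even if an error occurs partway through, which the one-line `open(file).read()` form leaves to the interpreter.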
13. ### Exercise 1

• Work with csv file (csv module)
  – C:\temp\python\simple-csv.csv
• Read into memory (create reader, open file)
• Print it out, slice it up, use indexes
• Put contents into a dictionary (zip built-in function)
• Put dictionary items into a list (list.append(dictionary item))
• Exercise-1.py and Exercise-1B_Write_Text_File.py
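The zip-into-a-dictionary step might look something like the sketch below. The header and rows are invented stand-ins for whatever simple-csv.csv actually contains, and this is not the code from Exercise-1.py.

```python
# Hypothetical data standing in for the rows read from the csv file.
header = ["id", "name", "county"]
rows = [
    ["1", "Bridge A", "Benton"],
    ["2", "Bridge B", "Washington"],
]

records = []
for row in rows:
    # zip pairs each column name with the value in the same position.
    record = dict(zip(header, row))
    records.append(record)

print(records[0]["county"])
```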
14. ### The web

• Infinite source of information
• Right-click and “Save as” is so lame
• Python can help you exploit the web
  – ftplib, http (urllib), mechanize, scraping (Beautiful Soup), send email (smtplib)
15. ### Fetching data

• Built-in libraries for ftp and http
• ftplib – log in, nav to directory, retrieve files
• urllib/urllib2 – pass in the url you want, get it back
• wget – GNU commandline tool
  – Can call with os.system()
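"Pass in the url you want, get it back" can be sketched as a small helper. This assumes Python 3, where the urllib/urllib2 functionality lives in `urllib.request`; the helper name `fetch` and the example URL in the comment are hypothetical.

```python
import os
import shutil
import urllib.request


def fetch(url, dest_path):
    """Download url to dest_path and return the number of bytes saved."""
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        # Stream the response to disk instead of holding it all in memory.
        shutil.copyfileobj(response, out)
    return os.path.getsize(dest_path)


# Example call (hypothetical URL, not run here):
# fetch("http://example.com/program.pdf", "program.pdf")
```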
16. ### Scraping

• Scrape data from a web page
• Well-structured content is a HUGE help, as is valid markup, which isn’t always there
• BeautifulSoup 3rd party module
  – Built in methods and regex’s help out
  – Great for getting at tables of data
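To give a feel for pulling a table out of a page, here is a sketch that uses the standard library's `html.parser` as a stand-in for BeautifulSoup, so it runs without installing anything. It is not BeautifulSoup's API, it assumes reasonably valid markup, and the HTML snippet and the `TableScraper` class are made up.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up page content for illustration.
html = """<table>
<tr><th>Bridge</th><th>Year</th></tr>
<tr><td>Example A</td><td>1972</td></tr>
<tr><td>Example B</td><td>1988</td></tr>
</table>"""

scraper = TableScraper()
scraper.feed(html)
rows = scraper.rows
```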
17. ### Emailing

• smtplib built-in library
• Best if you have the IP of your email server
• Port blocking can be an issue
    import smtplib
    server = smtplib.SMTP(email_server_ip)
    msg = 'All TPS reports need new cover sheets'
    server.sendmail('[email protected]', '[email protected]', msg)
    server.quit()
• There’s always Gmail too…
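A slightly fuller sketch of the same idea, assuming Python 3: build the message with `email.message.EmailMessage` so it carries proper From/To/Subject headers, then hand it to `smtplib`. The addresses, the helper names, and the default port 25 are placeholders; `send` is defined but not called here because it needs a reachable mail server.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Return an email message with basic headers and a plain-text body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg, server_address, port=25):
    """Send msg through the SMTP server at server_address (assumed reachable)."""
    with smtplib.SMTP(server_address, port) as server:
        server.send_message(msg)


# Placeholder addresses for illustration only.
msg = build_message(
    "sender@example.com",
    "recipient@example.com",
    "TPS reports",
    "All TPS reports need new cover sheets",
)
```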
18. ### Exercise 2

• Go over an FTP example together
• Fetch some data from the web using urllib
• Go to the AR GIS User’s Forum site and pull down the conference program pdf (urllib)
• Exercise-2.py
• BS_Scrape.py
• Fetching_Data_Example.py
• Fetching_Get_DRGs_Example.py
19. ### Regular Expressions

• Powerful, standardized searching, replacing, and parsing of text with complex patterns of characters
• An incredibly complex topic
• Simple ones can be sooooo helpful
• re module in standard library

* Patience required
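A few of the "simple ones" as a sketch, using the `re` module. The sample text, bridge numbers, and dates are invented.

```python
import re

# Made-up text for illustration.
text = "Bridge 04512 inspected 2011-09-01; bridge 17730 inspected 2010-03-15."

# Searching: find every date that looks like YYYY-MM-DD.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)

# Parsing: a group in parentheses returns just the captured part.
bridge_ids = re.findall(r"[Bb]ridge (\d+)", text)

# Pull the pieces of the first date apart with numbered groups.
match = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)
year = match.group(1)

# Replacing: collapse runs of whitespace into a single space.
cleaned = re.sub(r"\s+", " ", "too    many   spaces")
```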
20. ### Modularizing code

• I’m lazy, so I want to reuse code
• import statement – call functionality in another module
• Have one custom module (a .py file) with code you use all the time
• Great way to package up helper functions
• ESRI does this with ConversionUtils.py
    C:\Program Files (x86)\ArcGIS\Server10.0\ArcToolBox\Scripts
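To show the idea in one runnable piece, the sketch below writes a tiny helper module to a temporary folder and then imports it. In practice you would just keep the .py file somewhere on your Python path. The module name `gis_helpers` and the function `clean_name` are hypothetical.

```python
import importlib
import os
import sys
import tempfile

# Write a small helper module to disk (normally you would create this by hand).
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "gis_helpers.py"), "w") as f:
    f.write(
        "def clean_name(name):\n"
        '    """Lowercase a name and replace spaces with underscores."""\n'
        '    return name.strip().lower().replace(" ", "_")\n'
    )

# Make the folder importable, then import and reuse the helper.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
import gis_helpers

result = gis_helpers.clean_name("  Benton County ")
print(result)
```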
21. ### Excel

• Love, hate, love
• Many modules out there
  – xlrd (read) / xlwt (write) – only .xls
  – openpyxl – read/write .xlsx
• Uses
  – Push text data to Excel file
  – Push featureclass data to Excel programmatically
  – Read someone else’s “database”

23. ### Exception Handling

• It’s necessary
• Useful error reporting
• Proper application cleanup
• Combine with logging
    try:
        do something...
    except:
        handle error...
    finally:
        clean up...
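The try/except/finally skeleton above, filled in as a small sketch. The function `safe_divide` is made up, and it catches a specific exception rather than using a bare `except`, which tends to hide unrelated bugs. The `steps` list is only there to show which blocks ran.

```python
import logging

log = logging.getLogger("exception_sketch")


def safe_divide(a, b):
    """Divide a by b, returning (result, steps) where steps records the blocks that ran."""
    steps = []
    try:
        steps.append("try")
        result = a / b
    except ZeroDivisionError as err:
        # Handle the error: report it and fall back to None.
        steps.append("except")
        log.error("Division failed: %s", err)
        result = None
    finally:
        # Cleanup runs whether or not an error occurred.
        steps.append("finally")
    return result, steps


print(safe_divide(6, 3))
print(safe_divide(1, 0))
```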
24. ### Logging

• Log files can save you
• Most code runs in background, so you get no console output
• Great for timing processing and debugging
• Append or write
• Environment: dev/test/prod
• Two options:
  – logging module
  – Just write out to text file
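The logging module option might be set up along these lines. This is a sketch: the logger name, the log file location, and the "work" being timed are all placeholders.

```python
import logging
import os
import tempfile
import time

# Hypothetical log location; a real script would use a fixed path.
log_path = os.path.join(tempfile.mkdtemp(), "process.log")

logger = logging.getLogger("nightly_job")
logger.setLevel(logging.DEBUG)

# mode="a" appends to an existing log; mode="w" would overwrite it.
handler = logging.FileHandler(log_path, mode="a")
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)

# Time a piece of processing and record it in the log.
start = time.time()
logger.info("Processing started")
total = sum(range(1000))  # stand-in for the real work
logger.info("Processing finished in %.3f seconds", time.time() - start)
```

The timestamp in each line is what makes the log useful for timing background jobs after the fact.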

26. ### Databases

• You can connect to pretty much ANY database
• Is there one true solution??
• pyodbc – Access, SQL Server, MySQL
• Oracle – cx_Oracle
• Others – pymssql, _mssql, MySQLdb
• Execute SQL statements through a connection
    conn = library.connect(driver/user/pwd)
    cursor = conn.cursor()
    for row in cursor.execute(sql):
        …do something
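The connect/cursor/execute pattern can be tried without a database server by using `sqlite3` from the standard library. That module is swapped in here only because it needs no setup and follows the same DB-API shape; connection strings for pyodbc or cx_Oracle differ. The table and its rows are made up.

```python
import sqlite3

# An in-memory database stands in for a real server connection.
conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

# Made-up table and data for illustration.
cursor.execute("CREATE TABLE bridges (id INTEGER, name TEXT, year_built INTEGER)")
cursor.executemany(
    "INSERT INTO bridges VALUES (?, ?, ?)",
    [(1, "Example A", 1972), (2, "Example B", 1988)],
)
conn.commit()

# Execute SQL through the cursor and loop over the rows that come back.
# The ? placeholder passes the value safely instead of pasting it into the SQL.
names = []
sql = "SELECT name, year_built FROM bridges WHERE year_built > ? ORDER BY id"
for row in cursor.execute(sql, (1980,)):
    names.append(row[0])

conn.close()
```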

28. ### ArcGIS

• Python support continues to improve
  – Python support for Label Expressions in 10.1!!!
• ArcPy – rich native Python site-package
  – Successor to arcgisscripting
• Organized in tools, functions, classes, modules
• Very well documented, but daunting
• Must have Python 2.6! At least according to ESRI…
29. ### Exercise

• Use what we have learned
• Nasty National Bridge Inventory data
  – Fetch it
  – Parse it
  – Process it
  – Push to file geodatabase table
  – Push to file geodatabase featureclass
• NBI_Data_Processing.py
30. ### Resources - FREE

• Dive into Python
• Python Cookbook
• Think Python
• Python docs
• gis.stackexchange.com
• Google is your friend (as always)
• Python community is HUGE and GIVING
31. ### Conferences

• pyArkansas – October 22, UCA Conway
• PyCon – THE national US Python conference
• FOSS4G – international open source for GIS
• ESRI Developer Summit – major dork-fest, but great learning opportunity and Palm Springs in March
32. ### IDEs and editors

• Wing – different license levels, good people
• Komodo – free version also available
• Notepad2 – ol’ standby editor
• Notepad++ – people swear by it
• PythonWin – another standby
• …dozens (at least) more editors out there…
33. ### Other fun stuff

• Geopy
• APIs
  – Flickr
  – Google
• XML processing
• SWAG