Python for GIS...and then some

Chad Cooper
September 01, 2011

Intermediate Python for GIS course taught at the 2011 Arkansas GIS Users Forum meeting, Bentonville, AR

Transcript

  1. Python for GIS…and then some
    2011 AR GIS User’s Forum Conference
    Chad Cooper
    Center for Advanced Spatial Technologies
    University of Arkansas, Fayetteville

  2. Intros
    • Name
    • What you do/where you work
    • Used Python much?
    – Any formal training?
    – What do you use it for?
    • Know any other languages?

  3. Objectives
    • Informal class
    – Ask questions, you will stump me, but we will find
    an answer
    – Expect tangents
    • NOT geared totally to ArcGIS
    • Let’s cover some important basics
    • Python for accomplishing other tasks
    • THINK – oddball and out of the ordinary
    applications will make you want more…

  4. (VERY Rough) Outline
    • Strings/operations
    • Lists, dictionaries,
    tuples, sets
    • File input/output
    • The web
    – Fetching
    – Scraping
    – Email
    – FTP
    • Regular expressions
    • Logging
    • Excel
    • Exception handling
    • ArcGIS
    • Databases
    • Resources
    • SWAG!

  5. Strings
    • Ordered collections of characters
    • Immutable
    • Raw strings: path = r'C:\temp\chad'
    (a raw string cannot end in a single backslash)
    • Indexing (with fruit = 'banana'): fruit[0] >> 'b'
    • Slicing: fruit[1:3] >> 'an'
    • Iteration/membership: for each in fruit
    'f' in fruit >> False
    • String formatting: 'a %s parrot' % 'dead'
    'a dead parrot'
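
The string operations on this slide can be tried out in a few lines. This is a minimal sketch, assuming fruit = 'banana' (the value implied by the slide's outputs); the variable names other than fruit and path are only for illustration.

```python
# Assumed value: the slide's outputs ('b' and 'an') imply fruit = 'banana'.
fruit = 'banana'

path = r'C:\temp\chad'            # raw string: backslashes are kept literally
first = fruit[0]                  # indexing, one character by position
middle = fruit[1:3]               # slicing, characters 1 up to (not including) 3
has_f = 'f' in fruit              # membership test
letters = [ch for ch in fruit]    # iteration, one character at a time
parrot = 'a %s parrot' % 'dead'   # %-style string formatting

print(first, middle, has_f, parrot)
```

Strings are immutable, so none of these operations change fruit; each one produces a new value.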

  6. Lists
    • List – ordered collection of arbitrary objects
    list1 = [0,1,2,3]
    list2 = ['zero','one','two','three']
    list3 = [0,'zero',1,'one',2,'two',3,'three']
    • Ordered
    list2.sort() >> ['one','three',...]
    list2.sort(reverse=True) >> ['zero','two',...]
    • Mutable – you can change it
    list1.append(4) >> [0,1,2,3,4]
    list1.reverse() >> [4,3,2,1,0]
    list2.insert(0,'one-half') >> ['one-half','zero',…]
    list2.extend(['four','five'])

  7. Lists…
    • Iterable – very important! for l in list3
    0
    zero ...
    • Membership 3 in list3 --> True
    • Nestable – 2D array/matrix
    list4 = [[0,1,2],
    [3,4,5],
    [6,7,8]]
    • Access by index – zero based
    list4[1] >> [3,4,5]
    list4[1][2] >> 5
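
The list operations from the last two slides, put together as one runnable sketch. The lists are the ones from the slides; row, cell and first_column are illustrative names.

```python
list1 = [0, 1, 2, 3]
list2 = ['zero', 'one', 'two', 'three']

list1.append(4)                  # list1 is now [0, 1, 2, 3, 4]
list1.reverse()                  # reversed in place
list2.sort()                     # alphabetical, in place
list2.extend(['four', 'five'])   # add several items at the end

list4 = [[0, 1, 2],
         [3, 4, 5],
         [6, 7, 8]]
row = list4[1]                         # second row (indexes are zero based)
cell = list4[1][2]                     # third item of the second row
first_column = [r[0] for r in list4]   # one value from every row

# enumerate() hands back the index along with each item
for i, r in enumerate(list4):
    print(i, r)
```

Note that sort(), reverse(), append() and extend() all change the list in place and return None, so there is nothing to assign from them.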

  8. Dictionaries
    • Unordered collection of arbitrary objects
    d = {1:'foo', 2:'bar'}
    • Key/value pairs – think hash/lookup table (keys
    don’t have to be numbers)
    d.keys() >> [1, 2]
    d.values() >> ['foo','bar']
    • Nestable, mutable
    d[3] = 'spam'
    del d[key]
    • Access by key, not offset
    d[2] >> 'bar'
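
A short sketch of the dictionary operations above. The county_seats dictionary is sample data added for illustration, not part of the course material. In Python 3, keys() and values() return views rather than lists, so they are wrapped in sorted() here.

```python
d = {1: 'foo', 2: 'bar'}
d[3] = 'spam'                 # add a new key/value pair
del d[1]                      # remove a pair by key
value = d[2]                  # access by key, not by position

# Keys do not have to be numbers (illustrative sample data)
county_seats = {'Benton': 'Bentonville', 'Washington': 'Fayetteville'}
seat = county_seats.get('Pulaski', 'unknown')   # get() avoids a KeyError

keys = sorted(d.keys())
values = sorted(d.values())

for key, val in sorted(d.items()):
    print(key, val)
```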

  9. Tuples
    • Ordered collection of arbitrary objects
    • Immutable – cannot add, remove, or change items
    • Access by offset
    • Basically an unchangeable list
    (1,2,'three',4,…)
    • So what’s the purpose?
    – FAST – great for iterating over constant set of
    values
    – SAFE – you can’t change it
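
A small sketch of what "immutable" means in practice. The point tuple and the grid dictionary are hypothetical examples.

```python
# Hypothetical coordinate record: (latitude, longitude, label)
point = (36.37, -94.21, 'Bentonville')

lat, lon, label = point          # unpack by position
second = point[1]                # access by offset

try:
    point[0] = 0.0               # tuples do not support item assignment
except TypeError as err:
    error_text = str(err)

# Because they are immutable, tuples can serve as dictionary keys
grid = {(0, 0): 'origin', (0, 1): 'north'}
```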

  10. Sets
    • Unordered collections of objects
    • Like mathematical sets – collection of distinct
    objects – NO DUPLICATES
    • Example – get rid of dups in a list
    L1=[2,2,3,4,5,5,3]
    L2=[]
    [L2.append(x) for x in L1 if x not in L2]
    >>> L2
    [2, 3, 4, 5]
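
A minimal sketch of the same de-duplication done with the built-in set type. A set does not keep the original order, so the sketch shows one sorted variant and one that preserves first-seen order; the names unique, ordered and seen are illustrative.

```python
L1 = [2, 2, 3, 4, 5, 5, 3]

unique = set(L1)          # duplicates disappear as the set is built
L2 = sorted(unique)       # back to a list, in sorted order

# Keep the original first-seen order, using a set only for fast lookups
seen = set()
ordered = []
for x in L1:
    if x not in seen:
        seen.add(x)
        ordered.append(x)

# Sets also support the usual mathematical operations
common = {1, 2, 3} & {3, 4}    # intersection
either = {1, 2, 3} | {3, 4}    # union
```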

  11. List comprehensions
    • Map one list to another by applying a function
    to each of the list elements
    • Original list goes unchanged
    L = [2,4,6,8]
    J = [elem * 2 for elem in L]
    >>> J
    [4, 8, 12, 16]
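
A comprehension can also filter items or call a method on each one. A short sketch; the names list is made-up sample data.

```python
L = [2, 4, 6, 8]

J = [elem * 2 for elem in L]             # map: apply an expression to each item
big = [elem for elem in L if elem > 4]   # filter: keep only some items

# Clean up messy text values (made-up sample data)
names = ['benton ', ' WASHINGTON', 'pulaski']
clean = [n.strip().title() for n in names]
```

In every case L and names are left unchanged; each comprehension builds a new list.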

  12. Files
    • Built in open function – slurp entire file into
    memory – OK except for huge files
    data = open(file).read().splitlines()
    • Iterate over the lines
    for line in data:
        do something
    • CSV module
    reader = csv.reader(open('C:/file.csv','rb'))
    for line in reader:
        do something
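
A runnable sketch of both approaches. It is written for Python 3, where csv files are opened in text mode with newline='' rather than the 'rb' mode used on the slide. The sample file and its contents are made up so the example is self-contained.

```python
import csv
import os
import tempfile

# Create a small sample file to read (made-up data)
csv_path = os.path.join(tempfile.mkdtemp(), 'sample.csv')
with open(csv_path, 'w', newline='') as f:
    csv.writer(f).writerows([['id', 'name'], ['1', 'foo'], ['2', 'bar']])

# Approach 1: slurp the whole file into a list of lines
with open(csv_path) as f:
    data = f.read().splitlines()

# Approach 2: let the csv module split each line into fields
with open(csv_path, newline='') as f:
    rows = [line for line in csv.reader(f)]

for line in rows:
    print(line)
```

The with statement closes the file automatically, even if an error occurs partway through.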

  13. Exercise 1
    • Work with csv file (csv module)
    – C:\temp\python\simple-csv.csv
    • Read into memory (create reader, open file)
    • Print it out, slice it up, use indexes
    • Put contents into a dictionary (zip function)
    • Put dictionary items into a list
    (list.append(dictionary item))
    • Exercise-1.py and Exercise-1B_Write_Text_File.py
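
One possible shape for the exercise, offered as a sketch rather than the contents of Exercise-1.py. The sample text stands in for simple-csv.csv, whose real columns are not shown here, so the field names are assumptions.

```python
import csv
import io

# Made-up stand-in for the contents of simple-csv.csv
sample = "id,name,county\n1,Bridge A,Benton\n2,Bridge B,Washington\n"

reader = csv.reader(io.StringIO(sample))
header = next(reader)                 # first row holds the column names

records = []
for row in reader:
    record = dict(zip(header, row))   # pair each column name with its value
    records.append(record)

for record in records:
    print(record['name'], record['county'])
```

zip() is a built-in function that pairs up items from two sequences, which makes it a convenient way to turn a header row and a data row into a dictionary.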

  14. The web
    • Infinite source of information
    • Right-click and “Save as” is so lame
    • Python can help you exploit the web
    – ftplib, http (urllib), mechanize, scraping (Beautiful
    Soup), send email (smtplib)

  15. Fetching data
    • Built-in libraries for ftp and http
    • ftplib – log in, nav to directory, retrieve files
    • urllib/urllib2 – pass in the url you want, get it
    back
    • wget – GNU commandline tool
    – Can call with os.system()
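
A minimal sketch of fetching a URL and saving it to disk. In Python 3 the urllib/urllib2 functionality lives in urllib.request. The fetch function is a hypothetical helper, and the demonstration uses a local file:// URL so it runs without a network connection; an http:// or ftp:// URL would be passed in the same way.

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlopen


def fetch(url, dest_path, chunk_size=64 * 1024):
    """Download url to dest_path in chunks; return the number of bytes written."""
    total = 0
    with urlopen(url) as response, open(dest_path, 'wb') as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
            total += len(chunk)
    return total


# Demonstration against a local file, addressed by a file:// URL
work_dir = tempfile.mkdtemp()
source = Path(work_dir) / 'source.txt'
source.write_bytes(b'conference program placeholder')
copied = os.path.join(work_dir, 'copy.txt')

n_bytes = fetch(source.as_uri(), copied)
print('Fetched %d bytes' % n_bytes)
```

Reading in chunks keeps memory use flat, which matters when the download is a large raster or archive.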

  16. Scraping
    • Scrape data from a web page
    • Well-structured content is a HUGE help, as is
    valid markup, which isn’t always there
    • BeautifulSoup – 3rd-party module
    – Built-in methods and regexes help out
    – Great for getting at tables of data
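
To show the idea of pulling a table out of markup without installing anything, here is a sketch that uses the standard library's html.parser in place of BeautifulSoup. It is not BeautifulSoup's API. The TableParser class and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of <td>/<th> cells, one list per <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None     # cells of the row being read, or None
        self._cell = None    # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._row = []
        elif tag in ('td', 'th') and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ('td', 'th') and self._cell is not None:
            self._row.append(''.join(self._cell).strip())
            self._cell = None
        elif tag == 'tr' and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up sample markup
html = """<table>
<tr><th>County</th><th>Bridges</th></tr>
<tr><td>Benton</td><td>12</td></tr>
</table>"""

parser = TableParser()
parser.feed(html)
parser.close()
print(parser.rows)
```

This works on tidy markup; real pages with broken or nested tables are where a forgiving parser such as BeautifulSoup earns its keep.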

  17. Emailing
    • smtplib – built-in library
    • Best if you have the IP of your email server
    • Port blocking can be an issue
    import smtplib
    server = smtplib.SMTP(email_server_ip)
    msg = 'All TPS reports need new cover sheets'
    server.sendmail('[email protected]',
    '[email protected]',
    msg)
    server.quit()
    • There’s always Gmail too…

  18. Exercise 2
    • Go over an FTP example together
    • Fetch some data from the web using urllib
    • Go to the AR GIS User’s Forum site and pull
    down the conference program pdf (urllib)
    • Exercise-2.py
    • BS_Scrape.py
    • Fetching_Data_Example.py
    • Fetching_Get_DRGs_Example.py

  19. Regular Expressions
    • Powerful, standardized searching, replacing,
    and parsing of text with complex patterns of
    characters
    • An incredibly complex topic
    • Simple ones can be sooooo helpful
    • re module in standard library
    • Patience required
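
Three simple patterns that cover a lot of everyday work: finding, capturing and replacing. The sample strings are made up for illustration.

```python
import re

# Find every .tif file name in a block of text (made-up sample)
text = 'Tiles: o35094a1.tif, o36093h8.tif and notes.txt'
tifs = re.findall(r'\b\w+\.tif\b', text)

# Capture the pieces of a date with groups
match = re.search(r'(\d{4})-(\d{2})-(\d{2})', 'Updated 2011-09-01')
year, month, day = match.groups()

# Replace runs of whitespace with a single space
tidy = re.sub(r'\s+', ' ', 'too   many    spaces')
```

Writing patterns as raw strings (the r prefix) keeps backslashes from being interpreted by Python before the re module sees them.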

  20. Modularizing code
    • I’m lazy, so I want to reuse code
    • import statement – call functionality in
    another module
    • Have one custom module (a .py file) with code
    you use all the time
    • Great way to package up helper functions
    • ESRI does this with ConversionUtils.py
    C:\Program Files (x86)\ArcGIS\Server10.0\ArcToolBox\Scripts

  21. Excel
    • Love, hate, love
    • Many modules out there
    – xlrd (read) / xlwt (write) – only .xls
    – openpyxl – read/write .xlsx
    • Uses
    – Push text data to Excel file
    – Push featureclass data to Excel programmatically
    – Read someone else’s “database”

  22. Excel Examples
    • Excel-xlrd-Example.py - read
    • Excel-xlwt-Example.py - write

  23. Exception Handling
    • It’s necessary
    • Useful error reporting
    • Proper application cleanup
    • Combine with logging
    try:
        do something...
    except:
        handle error...
    finally:
        clean up...
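
A concrete version of the try/except/finally pattern. The count_lines function is a hypothetical helper written for this sketch. It catches a specific exception type rather than using a bare except, which would also swallow unrelated errors.

```python
import os
import tempfile


def count_lines(path):
    """Return the number of lines in path, or None if it cannot be read."""
    handle = None
    try:
        handle = open(path)
        return sum(1 for _ in handle)
    except OSError as err:
        # Useful error reporting: say what failed and why
        print('Could not read %s: %s' % (path, err))
        return None
    finally:
        # Cleanup runs whether or not an error occurred
        if handle is not None:
            handle.close()


work_dir = tempfile.mkdtemp()
good_path = os.path.join(work_dir, 'three-lines.txt')
with open(good_path, 'w') as f:
    f.write('a\nb\nc\n')

found = count_lines(good_path)
missing = count_lines(os.path.join(work_dir, 'no-such-file.txt'))
```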

  24. Logging
    • Log files can save you
    • Most code runs in background, so you get no
    console output
    • Great for timing processing and debugging
    • Append or write
    • Environment: dev/test/prod
    • Two options:
    – logging module
    – Just write out to text file
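
A minimal sketch of the logging module option, combined with exception handling and simple timing. The logger name, log file location and messages are assumptions made for the example, and this is not the contents of Logging_Example.py.

```python
import logging
import os
import tempfile
import time

# Hypothetical log location; a real script would use a fixed, known path
log_path = os.path.join(tempfile.mkdtemp(), 'process.log')

logger = logging.getLogger('nightly_job')          # hypothetical logger name
logger.setLevel(logging.DEBUG)
handler = logging.FileHandler(log_path, mode='a')  # 'a' appends, 'w' overwrites
handler.setFormatter(
    logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
logger.addHandler(handler)

start = time.time()
logger.info('Processing started')
try:
    1 / 0                                          # stand-in for real work failing
except ZeroDivisionError:
    logger.exception('Processing failed')          # logs the message plus traceback
logger.info('Finished in %.2f seconds', time.time() - start)

logger.removeHandler(handler)
handler.close()

with open(log_path) as f:
    log_text = f.read()
print(log_text)
```

logger.exception() records the full traceback, which is usually what you want when a scheduled script fails with nobody watching the console.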

  25. Logging Examples
    • Logging_Example.py
    • Logging_Example_With_Exception_Handling.py

  26. Databases
    • You can connect to pretty much ANY database
    • Is there one true solution??
    • pyodbc – Access, SQL Server, MySQL
    • Oracle – cx_Oracle
    • Others – pymssql, _mssql, MySQLdb
    • Execute SQL statements through a connection
    conn = library.connect(driver/user/pwd)
    cursor = conn.cursor()
    for row in cursor.execute(sql):
        …do something
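
The connect/cursor/execute pattern can be tried without a database server by using sqlite3 from the standard library. This is a stand-in chosen so the sketch runs anywhere; pyodbc, cx_Oracle and the others follow the same general pattern but take their own connection strings. The table and rows are made up.

```python
import sqlite3

conn = sqlite3.connect(':memory:')     # temporary in-memory database
cursor = conn.cursor()

# Made-up table and data
cursor.execute('CREATE TABLE bridges (id INTEGER, name TEXT)')
cursor.executemany('INSERT INTO bridges VALUES (?, ?)',
                   [(1, 'Bridge A'), (2, 'Bridge B')])
conn.commit()

names = []
for row in cursor.execute('SELECT id, name FROM bridges ORDER BY id'):
    names.append(row[1])               # each row comes back as a tuple

conn.close()
print(names)
```

Passing values through ? placeholders, rather than building the SQL string by hand, lets the library handle quoting safely.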

  27. Database Example
    • Databases_pyodbc_Connect_To_Access_Example.py

  28. ArcGIS
    • Python support continues to improve
    – Python support for Label Expressions in 10.1!!!
    • ArcPy – rich native Python site-package
    – Successor to arcgisscripting
    • Organized in tools, functions, classes, modules
    • Very well documented, but daunting
    • Must have Python 2.6! At least according to ESRI…

  29. Exercise
    • Use what we have learned
    • Nasty National Bridge Inventory data
    – Fetch it
    – Parse it
    – Process it
    – Push to file geodatabase table
    – Push to file geodatabase featureclass
    • NBI_Data_Processing.py

  30. Resources - FREE
    • Dive into Python
    • Python Cookbook
    • Think Python
    • Python docs
    • gis.stackexchange.com
    • Google is your friend (as always)
    • Python community is HUGE and GIVING

  31. Conferences
    • pyArkansas – October 22, UCA Conway
    • PyCon – THE national US Python conference
    • FOSS4G – international open source for GIS
    • ESRI Developer Summit – major dork-fest, but
    great learning opportunity and Palm Springs in
    March

  32. IDEs and editors
    • Wing – different license levels, good people
    • Komodo – free version also available
    • Notepad2 – ol’ standby editor
    • Notepad++ – people swear by it
    • PythonWin – another standby
    • …dozens (at least) more editors out there…

  33. Other fun stuff
    • Geopy
    • APIs
    – Flickr
    – Google
    • XML processing
    • SWAG
