Public Python for the greater good by JD Bothma

Pycon ZA
October 06, 2016

We at Code For South Africa use technology to promote informed decision making for positive social change. This can mean general awareness of what's going on, as well as deep critical research and analysis. We run the civic tech movement {code}bridge, where people come to hack on civic tech projects, together or on their own. I'll give a quick summary of some outputs of this community in Cape Town and Ethekwini.

We'll summarise some work we've done, mostly using common Python tools, for the good of South African society. In particular, I'll show how I scraped and mirrored a government website on a tight budget at {code}bridge for better access to public information, and saw usage pick up right after the local elections. We'll also show how a little bit of tech can empower citizens to hold government to account and to participate in the governing and development of our infrastructure, and how seemingly boring government notices really come to life when made accessible and personal.

This talk is aimed at anyone keen on making a big impact with a little bit of tech, and interested in improving lives. There is plenty of low-hanging fruit out there, where technology can improve lives by facilitating the necessary groundwork. I'd like to show you how easy it is to make an impact.

This is a heavily revised version of a talk given at DebConf 2016, with several new projects matured or launched since.

Transcript

  1. MFMA HTML...

    <A HREF="http://mfma.treasury.gov.za/Documents/Forms/AllItems.aspx?RootFolder=..."
       onclick="javascript:EnterFolder('http:\u002f\u002fmfma.treasury.gov.za\u002f...'); return false;">
      04. Service Delivery and Budget Implementation Plans
    </A>

    http://mfma.treasury.gov.za/Documents/Forms/AllItems.aspx?RootFolder=%2fDocuments%2f04%2e%20Service%20Delivery%20and%20Budget%20Implementation%20Plans&amp;FolderCTID=&amp;View=%7b84CA1A01%2dEF8A%2d4DE0%2d8DC4%2d47D223CB5867%7d

    RootFolder=/Documents/04. Service Delivery and Budget Implementation Plans
    FolderCTID=
    View={84CA1A01-EF8A-4DE0-8DC4-47D223CB5867}

    {code}bridge
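  2. MfmaSpider

    import scrapy

    # PageItem is the project's scrapy.Item subclass (see the next slide).

    class MfmaSpider(scrapy.Spider):
        name = "mfma"  # assumed: Scrapy requires a spider name, omitted on the slide
        start_urls = ["http://mfma.treasury.gov.za"]

        def parse(self, response):
            # Yield every item extracted from the page.
            for item in self.page_item(response):
                yield item

        def page_item(self, response):
            page_item = PageItem()
            # The page title is the current breadcrumb element.
            title_css = '.breadcrumbCurrent'
            title = response.selector.css(title_css)
            page_item['title'] = title.xpath('text()')[0].extract()
            # Scrape content etc...
            yield page_item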
  3. Scrapy Item - MFMA Page Item

    {
      "form_table_rows": [],
      "original_url": "http://mfma.treasury.gov.za/Return_Forms/...",
      "breadcrumbs": "<span><a href=\"http://mfma.treasury.gov.za...",
      "title": "Return Forms",
      "path": "/Return_Forms/index.html",
      "type": "page",
      "body": "<div>\nAll Return Forms contain new demarcation codes ..."
    }
  4. https://mfmamirror.github.io 3 Resources on S3

    ITEM_PIPELINES = {
        'mfma.pipelines.DepagingPipeline': 100,
        'mfma.pipelines.FileArchivePipeline': 100,
        # 'mfma.pipelines.MirrorBuilderPipeline': 300,
    }
  5. Get Involved

    Play with your city’s data
    Show what’s possible
    Join/start a civic tech/open data group
    Take a sabbatical ( ͡º ͜ʖ ͡º) code4sa.org/careers
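Slide 1 shows a SharePoint-style folder link next to its decoded query parameters. The decoding step can be sketched with the Python standard library alone. This is a minimal illustration, not the code used in the talk: the function name `decode_folder_link` is hypothetical, and the URL is the one shown on the slide.

```python
from html import unescape
from urllib.parse import parse_qs, urlsplit

# The folder link as it appears in the page source on slide 1
# (HTML-escaped ampersands, percent-encoded values).
href = (
    "http://mfma.treasury.gov.za/Documents/Forms/AllItems.aspx?"
    "RootFolder=%2fDocuments%2f04%2e%20Service%20Delivery%20and%20Budget"
    "%20Implementation%20Plans&amp;FolderCTID=&amp;"
    "View=%7b84CA1A01%2dEF8A%2d4DE0%2d8DC4%2d47D223CB5867%7d"
)


def decode_folder_link(raw_href):
    """Return the query parameters of a folder link as a plain dict.

    Hypothetical helper: undoes the HTML escaping, then the percent-encoding.
    """
    url = unescape(raw_href)  # turn "&amp;" back into "&"
    query = urlsplit(url).query
    # keep_blank_values=True preserves empty parameters such as FolderCTID.
    params = parse_qs(query, keep_blank_values=True)
    # Each parameter occurs once here, so keep only the first value.
    return {key: values[0] for key, values in params.items()}


params = decode_folder_link(href)
for key, value in params.items():
    print(f"{key}={value}")
```

A spider could use the decoded `RootFolder` value to decide which folder to visit next, instead of trying to execute the `onclick` JavaScript.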
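Slide 4 registers item pipelines in `ITEM_PIPELINES`; Scrapy runs them in ascending order of the integer values. The slides do not show what the pipelines do, so the class below is only a guess at the general shape of a file-archiving step. `JsonArchivePipeline`, its `out_dir` argument and the `.json` naming scheme are all assumptions for illustration, and the item dict reuses fields from slide 3.

```python
import json
import os
import tempfile


class JsonArchivePipeline:
    """Hypothetical pipeline: store each scraped item as a JSON file.

    Scrapy pipelines are plain classes with a process_item(item, spider)
    method. Scrapy normally builds them without arguments (or through a
    from_crawler classmethod); out_dir is passed directly here to keep
    the sketch self-contained.
    """

    def __init__(self, out_dir):
        self.out_dir = out_dir

    def process_item(self, item, spider):
        data = dict(item)
        # Assumed layout: mirror the item's "path" under out_dir.
        rel_path = data["path"].lstrip("/") + ".json"
        target = os.path.join(self.out_dir, rel_path)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "w", encoding="utf-8") as handle:
            json.dump(data, handle)
        # Return the item so that later pipelines still receive it.
        return item


# Usage outside Scrapy, with a trimmed copy of the item from slide 3.
out_dir = tempfile.mkdtemp()
pipeline = JsonArchivePipeline(out_dir)
item = {"title": "Return Forms", "path": "/Return_Forms/index.html", "type": "page"}
result = pipeline.process_item(item, spider=None)
archived = os.path.join(out_dir, "Return_Forms", "index.html.json")
```

Returning the item from `process_item` matters: a pipeline that returns nothing would hand `None` to the next stage.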