Slide 1

Slide 1 text

VOTE USING PYTHON TO GET OUT THE Mike Pirnat ž @mpirnat ž mike.pirnat.com ž  ž  ž  ž  ž

Slide 2

Slide 2 text

Background

Slide 3

Slide 3 text

Objectives • Manage committee outreach to voters • Mailing list • Door-to-door list • Call lists by school • Minimize wasted contacts • Avoid redundant contacts • Repeatable; re-create database at any time

Slide 4

Slide 4 text

Sources of Data CSVs from Ohio Board of Elections: • List of registered voters • Absentee ballots

Slide 5

Slide 5 text

Sources of Data Excel spreadsheets (XLSX) from levy committee: • School directory data • School district staff list • Early childhood PTA list • Do not contact list

Slide 6

Slide 6 text

Technologies • Excel/Numbers • Python • Standard library: csv, datetime, re • Standard library: SQLite • SQLAlchemy • Alembic

Slide 7

Slide 7 text

Data Model

Slide 8

Slide 8 text

Voter CSV - name - BOE voter id - precinct - ward - address - party - status - birth year - ballot history Student CSV - name - address - parent names - phone - school Absentee Ballot CSV - name - BOE voter id - address - date requested - date returned - challenged PTA CSV - name - address - phone - email - position - kid info Staff CSV - name - school - position - address - phone - email Do Not Contact CSV - name - address

Slide 9

Slide 9 text

Voter CSV - name - BOE voter id - party - status - birth year - ballot history Student CSV - name - school Absentee Ballot CSV - name - BOE voter id - address - date requested - date returned - challenged PTA CSV - name - email - position - kid info Staff CSV - name - school - position - email Do Not Contact CSV - name - address - precinct - ward - address - address - address - address - phone - phone - phone - parent names

Slide 10

Slide 10 text

Voter CSV - name - BOE voter id - party - status - birth year - ballot history Student CSV - name - school Absentee Ballot CSV - name - BOE voter id - address - date requested - date returned - challenged PTA CSV - name - email - position - kid info Staff CSV - name - school - position - email Do Not Contact CSV - name - address - precinct - ward - address - address - address - address - phone - phone - phone - parent names Household - address - precinct - ward - parent names - phone - staff - pta - do not contact

Slide 11

Slide 11 text

Voter CSV - name - BOE voter id - precinct - ward - address - party - status - birth year - ballot history Student CSV - address - parent names - phone Absentee Ballot CSV - name - BOE voter id - address - date requested - date returned - challenged PTA CSV - name - address - phone - email - position - kid info Staff CSV - name - school - position - address - phone - email Do Not Contact CSV - name - address - name - school

Slide 12

Slide 12 text

Voter CSV - name - BOE voter id - precinct - ward - address - party - status - birth year - ballot history Student CSV - address - parent names - phone Absentee Ballot CSV - name - BOE voter id - address - date requested - date returned - challenged PTA CSV - name - address - phone - email - position - kid info Staff CSV - name - school - position - address - phone - email Do Not Contact CSV - name - address - name - school Student - name - school

Slide 13

Slide 13 text

Voter CSV - precinct - ward - address - birth year - ballot history Student CSV - name - address - parent names - phone - school Absentee Ballot CSV - name - address PTA CSV - name - address - phone - email - position - kid info Staff CSV - name - school - position - address - phone - email Do Not Contact CSV - name - address - name - party - status - BOE voter id - date requested - BOE voter id - date returned - challenged

Slide 14

Slide 14 text

Voter CSV - precinct - ward - address - birth year - ballot history Student CSV - name - address - parent names - phone - school Absentee Ballot CSV - name - address PTA CSV - name - address - phone - email - position - kid info Staff CSV - name - school - position - address - phone - email Do Not Contact CSV - name - address - name - party - status - BOE voter id - date requested - BOE voter id - date returned - challenged Voter - name - party - status - BOE voter id - absentee requested - absentee returned - absentee challenged

Slide 15

Slide 15 text

Voter CSV - name - BOE voter id - precinct - ward - address - party - status - birth year Student CSV - name - address - parent names - phone - school Absentee Ballot CSV - name - BOE voter id - address - date requested - date returned - challenged PTA CSV - name - address - phone - email - position - kid info Staff CSV - name - school - position - address - phone - email Do Not Contact CSV - name - address - ballot history

Slide 16

Slide 16 text

Voter CSV - name - BOE voter id - precinct - ward - address - party - status - birth year Student CSV - name - address - parent names - phone - school Absentee Ballot CSV - name - BOE voter id - address - date requested - date returned - challenged PTA CSV - name - address - phone - email - position - kid info Staff CSV - name - school - position - address - phone - email Do Not Contact CSV - name - address - ballot history Ballot - ballot type - election type - date

Slide 17

Slide 17 text

Household - address - precinct - ward - parent names - phone - staff - pta - do not contact Student - name - school Voter - name - party - status - BOE voter id - absentee requested - absentee returned - absentee challenged Ballot - type - date

Slide 18

Slide 18 text

Structure (Briefly)

Slide 19

Slide 19 text

levydb/ alembic/ data/ database.db levydb/ __init__.py db.py ingest_data.py models.py make_a_report.py make_another_report.py ...

Slide 20

Slide 20 text

Model Classes

Slide 21

Slide 21 text

# levydb/models.py... from sqlalchemy import Column, Integer, String, \ Boolean, Date, ForeignKey from sqlalchemy.orm import relationship, backref from sqlalchemy.ext.declarative import declarative_base Base = declarative_base()

Slide 22

Slide 22 text

# levydb/models.py... from sqlalchemy import Column, Integer, String, \ Boolean, Date, ForeignKey from sqlalchemy.orm import relationship, backref from sqlalchemy.ext.declarative import declarative_base Base = declarative_base()

Slide 23

Slide 23 text

# levydb/models.py... from sqlalchemy import Column, Integer, String, \ Boolean, Date, ForeignKey from sqlalchemy.orm import relationship, backref from sqlalchemy.ext.declarative import declarative_base Base = declarative_base()

Slide 24

Slide 24 text

# levydb/models.py... from sqlalchemy import Column, Integer, String, \ Boolean, Date, ForeignKey from sqlalchemy.orm import relationship, backref from sqlalchemy.ext.declarative import declarative_base Base = declarative_base()

Slide 25

Slide 25 text

class Household(Base): __tablename__ = 'households' id = Column(Integer, primary_key=True) precinct = Column(String) ward = Column(String) address = Column(String) mother_last_name = Column(String) mother_first_name = Column(String) father_last_name = Column(String) father_first_name = Column(String) phone = Column(String) do_not_contact = Column(Boolean) early_childhood_pta = Column(Boolean) staff = Column(Boolean) students = relationship('Student', order_by='Student.id', backref='household', cascade='all, delete, delete-orphan') voters = relationship('Voter', order_by='Voter.id', backref='household', cascade='all, delete, delete-orphan')

Slide 26

Slide 26 text

class Household(Base): __tablename__ = 'households' id = Column(Integer, primary_key=True) precinct = Column(String) ward = Column(String) address = Column(String) mother_last_name = Column(String) mother_first_name = Column(String) father_last_name = Column(String) father_first_name = Column(String) phone = Column(String) do_not_contact = Column(Boolean) early_childhood_pta = Column(Boolean) staff = Column(Boolean) students = relationship('Student', order_by='Student.id', backref='household', cascade='all, delete, delete-orphan') voters = relationship('Voter', order_by='Voter.id', backref='household', cascade='all, delete, delete-orphan')

Slide 27

Slide 27 text

class Household(Base): __tablename__ = 'households' id = Column(Integer, primary_key=True) precinct = Column(String) ward = Column(String) address = Column(String) mother_last_name = Column(String) mother_first_name = Column(String) father_last_name = Column(String) father_first_name = Column(String) phone = Column(String) do_not_contact = Column(Boolean) early_childhood_pta = Column(Boolean) staff = Column(Boolean) students = relationship('Student', order_by='Student.id', backref='household', cascade='all, delete, delete-orphan') voters = relationship('Voter', order_by='Voter.id', backref='household', cascade='all, delete, delete-orphan')

Slide 28

Slide 28 text

class Household(Base): __tablename__ = 'households' id = Column(Integer, primary_key=True) precinct = Column(String) ward = Column(String) address = Column(String) mother_last_name = Column(String) mother_first_name = Column(String) father_last_name = Column(String) father_first_name = Column(String) phone = Column(String) do_not_contact = Column(Boolean) early_childhood_pta = Column(Boolean) staff = Column(Boolean) students = relationship('Student', order_by='Student.id', backref='household', cascade='all, delete, delete-orphan') voters = relationship('Voter', order_by='Voter.id', backref='household', cascade='all, delete, delete-orphan')

Slide 29

Slide 29 text

class Voter(Base): __tablename__ = 'voters' id = Column(Integer, primary_key=True) boe_voter_id = Column(Integer) last_name = Column(String) first_name = Column(String) middle_name = Column(String) suffix = Column(String) birth_year = Column(Integer) party = Column(String) status = Column(String) absentee_ballot_requested = Column(Boolean) absentee_ballot_returned = Column(Date) absentee_ballot_challenged = Column(String) household_id = Column(Integer, ForeignKey('households.id')) ballots = relationship('Ballot', order_by='Ballot.id', backref='voter', cascade='all, delete, delete-orphan')

Slide 30

Slide 30 text

class Voter(Base): __tablename__ = 'voters' id = Column(Integer, primary_key=True) boe_voter_id = Column(Integer) last_name = Column(String) first_name = Column(String) middle_name = Column(String) suffix = Column(String) birth_year = Column(Integer) party = Column(String) status = Column(String) absentee_ballot_requested = Column(Boolean) absentee_ballot_returned = Column(Date) absentee_ballot_challenged = Column(String) household_id = Column(Integer, ForeignKey('households.id')) ballots = relationship('Ballot', order_by='Ballot.id', backref='voter', cascade='all, delete, delete-orphan')

Slide 31

Slide 31 text

class Voter(Base): __tablename__ = 'voters' id = Column(Integer, primary_key=True) boe_voter_id = Column(Integer) last_name = Column(String) first_name = Column(String) middle_name = Column(String) suffix = Column(String) birth_year = Column(Integer) party = Column(String) status = Column(String) absentee_ballot_requested = Column(Boolean) absentee_ballot_returned = Column(Date) absentee_ballot_challenged = Column(String) household_id = Column(Integer, ForeignKey('households.id')) ballots = relationship('Ballot', order_by='Ballot.id', backref='voter', cascade='all, delete, delete-orphan')

Slide 32

Slide 32 text

class Voter(Base): __tablename__ = 'voters' id = Column(Integer, primary_key=True) boe_voter_id = Column(Integer) last_name = Column(String) first_name = Column(String) middle_name = Column(String) suffix = Column(String) birth_year = Column(Integer) party = Column(String) status = Column(String) absentee_ballot_requested = Column(Boolean) absentee_ballot_returned = Column(Date) absentee_ballot_challenged = Column(String) household_id = Column(Integer, ForeignKey('households.id')) ballots = relationship('Ballot', order_by='Ballot.id', backref='voter', cascade='all, delete, delete-orphan')

Slide 33

Slide 33 text

class Voter(Base): __tablename__ = 'voters' id = Column(Integer, primary_key=True) boe_voter_id = Column(Integer) last_name = Column(String) first_name = Column(String) middle_name = Column(String) suffix = Column(String) birth_year = Column(Integer) party = Column(String) status = Column(String) absentee_ballot_requested = Column(Boolean) absentee_ballot_returned = Column(Date) absentee_ballot_challenged = Column(String) household_id = Column(Integer, ForeignKey('households.id')) ballots = relationship('Ballot', order_by='Ballot.id', backref='voter', cascade='all, delete, delete-orphan')

Slide 34

Slide 34 text

class Ballot(Base): __tablename__ = 'ballots' id = Column(Integer, primary_key=True) ballot_type = Column(String) election_type = Column(String) election_date = Column(Date) voter_id = Column(Integer, ForeignKey('voters.id'))

Slide 35

Slide 35 text

class Student(Base): __tablename__ = 'students' id = Column(Integer, primary_key=True) last_name = Column(String) first_name = Column(String) school_name = Column(String) household_id = Column(Integer, ForeignKey('households.id'))

Slide 36

Slide 36 text

Creating the Database

Slide 37

Slide 37 text

$ alembic init alembic

Slide 38

Slide 38 text

$ alembic init alembic => alembic/ README env.py script.py.mako versions/ alembic.ini

Slide 39

Slide 39 text

# in alembic.ini... sqlalchemy.url = sqlite:////Users/mike/code/levydb/database.db # in alembic/env.py... from levydb.models import Base target_metadata = Base.metadata

Slide 40

Slide 40 text

$ alembic revision -m "create model"

Slide 41

Slide 41 text

$ alembic revision -m "create model" => alembic/versions/_create_model.py

Slide 42

Slide 42 text

$ alembic revision -m "create model" => alembic/versions/_create_model.py $ alembic upgrade OR $ alembic upgrade head

Slide 43

Slide 43 text

Ingestion

Slide 44

Slide 44 text

# In levydb/ingest_data.py... def ingest_voter_data(): ... def ingest_student_data(): ... def ingest_staff_data(): ... def ingest_pta_data(): ... def ingest_do_not_contact_data(): ...

Slide 45

Slide 45 text

Connecting to the Database

Slide 46

Slide 46 text

# In levydb/db.py... from sqlalchemy import create_engine from sqlalchemy.orm import sessionmaker from levydb import config Session = sessionmaker() engine = create_engine(config.DB_FILE, echo=config.DB_ECHO) Session.configure(bind=engine) session = Session() # In levydb/ingest_data.py... from levydb.db import engine, session

Slide 47

Slide 47 text

# In levydb/db.py... from sqlalchemy import create_engine from sqlalchemy.orm import sessionmaker from levydb import config Session = sessionmaker() engine = create_engine(config.DB_FILE, echo=config.DB_ECHO) Session.configure(bind=engine) session = Session() # In levydb/ingest_data.py... from levydb.db import engine, session

Slide 48

Slide 48 text

# In levydb/db.py... from sqlalchemy import create_engine from sqlalchemy.orm import sessionmaker from levydb import config Session = sessionmaker() engine = create_engine(config.DB_FILE, echo=config.DB_ECHO) Session.configure(bind=engine) session = Session() # In levydb/ingest_data.py... from levydb.db import engine, session

Slide 49

Slide 49 text

# In levydb/db.py... from sqlalchemy import create_engine from sqlalchemy.orm import sessionmaker from levydb import config Session = sessionmaker() engine = create_engine(config.DB_FILE, echo=config.DB_ECHO) Session.configure(bind=engine) session = Session() # In levydb/ingest_data.py... from levydb.db import engine, session

Slide 50

Slide 50 text

Reading CSVs

Slide 51

Slide 51 text

import csv with open("some_data.csv") as f: reader = csv.reader(f) header_row = reader.next() for row in reader: value0 = row[0] value1 = row[1] ...

Slide 52

Slide 52 text

import csv with open("some_data.csv") as f: reader = csv.reader(f) header_row = reader.next() for row in reader: value0 = row[0] value1 = row[1] ...

Slide 53

Slide 53 text

import csv with open("some_data.csv") as f: reader = csv.reader(f) header_row = reader.next() for row in reader: value0 = row[0] value1 = row[1] ...

Slide 54

Slide 54 text

import csv with open("some_data.csv") as f: reader = csv.reader(f) header_row = reader.next() for row in reader: value0 = row[0] value1 = row[1] ...

Slide 55

Slide 55 text

Inserting Data with SQLAlchemy

Slide 56

Slide 56 text

for row in reader: ... model_object = ModelObject( attr0=value0, attr1=value1, ...) session.add(model_object) session.commit()

Slide 57

Slide 57 text

for row in reader: ... model_object = ModelObject( attr0=value0, attr1=value1, ...) session.add(model_object) session.commit()

Slide 58

Slide 58 text

for row in reader: ... model_object = ModelObject( attr0=value0, attr1=value1, ...) session.add(model_object) session.commit()

Slide 59

Slide 59 text

for row in reader: ... model_object = ModelObject( attr0=value0, attr1=value1, ...) session.add(model_object) session.commit()

Slide 60

Slide 60 text

Ballot Data

Slide 61

Slide 61 text

Election type and date are encoded as header row titles: ...,G_11_04_2014,S_08_05_2014,P_05-06-2014,... The ballot type voted in those elections is specified in each row and can indicate a general or party-specific ballot (eg, in primaries): ...,Y,,D,... Ballot Data

Slide 62

Slide 62 text

Election type and date are encoded as header row titles: ...,G_11_04_2014,S_08_05_2014,P_05-06-2014,... The ballot type voted in those elections is specified in each row and can indicate a general or party-specific ballot (eg, in primaries): ...,Y,,D,... Ballot Data

Slide 63

Slide 63 text

Election type and date are encoded as header row titles: ...,G_11_04_2014,S_08_05_2014,P_05-06-2014,... The ballot type voted in those elections is specified in each row and can indicate a general or party-specific ballot (eg, in primaries): ...,Y,,D,... Ballot Data

Slide 64

Slide 64 text

Election type and date are encoded as header row titles: ...,G_11_04_2014,S_08_05_2014,P_05-06-2014,... The ballot type voted in those elections is specified in each row and can indicate a general or party-specific ballot (eg, in primaries): ...,Y,,D,... Ballot Data

Slide 65

Slide 65 text

Election type and date are encoded as header row titles: ...,G_11_04_2014,S_08_05_2014,P_05-06-2014,... The ballot type voted in those elections is specified in each row and can indicate a general or party-specific ballot (eg, in primaries): ...,Y,,D,... Ballot Data

Slide 66

Slide 66 text

Election type and date are encoded as header row titles: ...,G_11_04_2014,S_08_05_2014,P_05-06-2014,... The ballot type voted in those elections is specified in each row and can indicate a general or party-specific ballot (eg, in primaries): ...,Y,,D,... Ballot Data

Slide 67

Slide 67 text

Election type and date are encoded as header row titles: ...,G_11_04_2014,S_08_05_2014,P_05-06-2014,... The ballot type voted in those elections is specified in each row and can indicate a general or party-specific ballot (eg, in primaries): ...,Y,,D,... Ballot Data

Slide 68

Slide 68 text

# Get ballot data for election_title, ballot_type in zip(header_row[18:65], row[18:65]): if not ballot_type: continue election_type, election_date = parse_election_title(election_title) ballot = Ballot( ballot_type=ballot_type, election_type=election_type, election_date=election_date, voter_id=voter.id) session.add(ballot) session.commit() def parse_election_title(title): election_type, month, day, year = title.replace('-', '_').split('_') election_date = datetime.date(int(year), int(month), int(day)) return election_type, election_date

Slide 69

Slide 69 text

# Get ballot data for election_title, ballot_type in zip(header_row[18:65], row[18:65]): if not ballot_type: continue election_type, election_date = parse_election_title(election_title) ballot = Ballot( ballot_type=ballot_type, election_type=election_type, election_date=election_date, voter_id=voter.id) session.add(ballot) session.commit() def parse_election_title(title): election_type, month, day, year = title.replace('-', '_').split('_') election_date = datetime.date(int(year), int(month), int(day)) return election_type, election_date

Slide 70

Slide 70 text

# Get ballot data for election_title, ballot_type in zip(header_row[18:65], row[18:65]): if not ballot_type: continue election_type, election_date = parse_election_title(election_title) ballot = Ballot( ballot_type=ballot_type, election_type=election_type, election_date=election_date, voter_id=voter.id) session.add(ballot) session.commit() def parse_election_title(title): election_type, month, day, year = title.replace('-', '_').split('_') election_date = datetime.date(int(year), int(month), int(day)) return election_type, election_date

Slide 71

Slide 71 text

# Get ballot data for election_title, ballot_type in zip(header_row[18:65], row[18:65]): if not ballot_type: continue election_type, election_date = parse_election_title(election_title) ballot = Ballot( ballot_type=ballot_type, election_type=election_type, election_date=election_date, voter_id=voter.id) session.add(ballot) session.commit() def parse_election_title(title): election_type, month, day, year = title.replace('-', '_').split('_') election_date = datetime.date(int(year), int(month), int(day)) return election_type, election_date

Slide 72

Slide 72 text

# Get ballot data for election_title, ballot_type in zip(header_row[18:65], row[18:65]): if not ballot_type: continue election_type, election_date = parse_election_title(election_title) ballot = Ballot( ballot_type=ballot_type, election_type=election_type, election_date=election_date, voter_id=voter.id) session.add(ballot) session.commit() def parse_election_title(title): election_type, month, day, year = title.replace('-', '_').split('_') election_date = datetime.date(int(year), int(month), int(day)) return election_type, election_date

Slide 73

Slide 73 text

# Get ballot data for election_title, ballot_type in zip(header_row[18:65], row[18:65]): if not ballot_type: continue election_type, election_date = parse_election_title(election_title) ballot = Ballot( ballot_type=ballot_type, election_type=election_type, election_date=election_date, voter_id=voter.id) session.add(ballot) session.commit() def parse_election_title(title): election_type, month, day, year = title.replace('-', '_').split('_') election_date = datetime.date(int(year), int(month), int(day)) return election_type, election_date

Slide 74

Slide 74 text

# Get ballot data for election_title, ballot_type in zip(header_row[18:65], row[18:65]): if not ballot_type: continue election_type, election_date = parse_election_title(election_title) ballot = Ballot( ballot_type=ballot_type, election_type=election_type, election_date=election_date, voter_id=voter.id) session.add(ballot) session.commit() def parse_election_title(title): election_type, month, day, year = title.replace('-', '_').split('_') election_date = datetime.date(int(year), int(month), int(day)) return election_type, election_date

Slide 75

Slide 75 text

Normalizing & Linking Data

Slide 76

Slide 76 text

Board of Elections: School Directory: house_no pre_dir street apartment zip 221 N BAKER ST B1 12345 Data != Data Mailing Street Mailing Zip 221 North Baker Street APT #B-1 12345

Slide 77

Slide 77 text

def normalize_address(address): # Convert to uppercase address = address.upper() # Remove non-alphanumerics/non-space address = re.sub(r'[^A-Z0-9 ]', '', address) # Condense whitespace address = re.sub(r'\s+', ' ', address) # Replace things for pattern, replacement in address_replacements: address = re.sub(pattern, replacement, address) # Condense whitespace again in case we messed it up along the way address = re.sub(r'\s+', ' ', address) # Strip again in case we added extra whitespace somehow address = address.strip() return address

Slide 78

Slide 78 text

def normalize_address(address): # Convert to uppercase address = address.upper() # Remove non-alphanumerics/non-space address = re.sub(r'[^A-Z0-9 ]', '', address) # Condense whitespace address = re.sub(r'\s+', ' ', address) # Replace things for pattern, replacement in address_replacements: address = re.sub(pattern, replacement, address) # Condense whitespace again in case we messed it up along the way address = re.sub(r'\s+', ' ', address) # Strip again in case we added extra whitespace somehow address = address.strip() return address

Slide 79

Slide 79 text

def normalize_address(address): # Convert to uppercase address = address.upper() # Remove non-alphanumerics/non-space address = re.sub(r'[^A-Z0-9 ]', '', address) # Condense whitespace address = re.sub(r'\s+', ' ', address) # Replace things for pattern, replacement in address_replacements: address = re.sub(pattern, replacement, address) # Condense whitespace again in case we messed it up along the way address = re.sub(r'\s+', ' ', address) # Strip again in case we added extra whitespace somehow address = address.strip() return address

Slide 80

Slide 80 text

def normalize_address(address): # Convert to uppercase address = address.upper() # Remove non-alphanumerics/non-space address = re.sub(r'[^A-Z0-9 ]', '', address) # Condense whitespace address = re.sub(r'\s+', ' ', address) # Replace things for pattern, replacement in address_replacements: address = re.sub(pattern, replacement, address) # Condense whitespace again in case we messed it up along the way address = re.sub(r'\s+', ' ', address) # Strip again in case we added extra whitespace somehow address = address.strip() return address

Slide 81

Slide 81 text

def normalize_address(address): # Convert to uppercase address = address.upper() # Remove non-alphanumerics/non-space address = re.sub(r'[^A-Z0-9 ]', '', address) # Condense whitespace address = re.sub(r'\s+', ' ', address) # Replace things for pattern, replacement in address_replacements: address = re.sub(pattern, replacement, address) # Condense whitespace again in case we messed it up along the way address = re.sub(r'\s+', ' ', address) # Strip again in case we added extra whitespace somehow address = address.strip() return address

Slide 82

Slide 82 text

def normalize_address(address): # Convert to uppercase address = address.upper() # Remove non-alphanumerics/non-space address = re.sub(r'[^A-Z0-9 ]', '', address) # Condense whitespace address = re.sub(r'\s+', ' ', address) # Replace things for pattern, replacement in address_replacements: address = re.sub(pattern, replacement, address) # Condense whitespace again in case we messed it up along the way address = re.sub(r'\s+', ' ', address) # Strip again in case we added extra whitespace somehow address = address.strip() return address

Slide 83

Slide 83 text

def normalize_address(address): # Convert to uppercase address = address.upper() # Remove non-alphanumerics/non-space address = re.sub(r'[^A-Z0-9 ]', '', address) # Condense whitespace address = re.sub(r'\s+', ' ', address) # Replace things for pattern, replacement in address_replacements: address = re.sub(pattern, replacement, address) # Condense whitespace again in case we messed it up along the way address = re.sub(r'\s+', ' ', address) # Strip again in case we added extra whitespace somehow address = address.strip() return address

Slide 84

Slide 84 text

address_replacements = [ ('APT', ''), ('SUITE', ''), (' EAST ', ' E '), (' WEST ', ' W '), (' NORTH ', ' N '), (' SOUTH ', ' S '), (r'(\D) ROAD', r'\1 RD'), (r'(\D) AVENUE', r'\1 AVE'), (r'(\D) DRIVE', r'\1 DR'), (r'(\D) CIRCLE', r'\1 CIR'), (r'(\D) TRAIL', r'\1 TRL'), (r'(\D) BOULEVARD', r'\1 BLVD'), (r'(\D) LANE', r'\1 LN'), (r'(\D) COURT', r'\1 CT'), (r'(\D) PARKWAY', r'\1 PKWY'), (r'(\D) PKY', r'\1 PKWY'), (r'(\D) MEADOWS', r'\1 MDWS'), (r'(\D) PLACE', r'\1 PL'), (r'(\D) ROWE', r'\1 ROW'), (r'(\D) CROSSING$', r'\1 XING'), ..., (r'(\s\D) (\d+)$', r'\1\2'), ]

Slide 85

Slide 85 text

# When ingesting voter data for row in reader: ... # Normalize address raw_address = ' '.join( [house_number, direction_prefix, street, apartment]) address = normalize_address(raw_address) # Try to match to an existing household household = session.query(Household).filter_by(address=address).first() if household: voter.household_id = household.id else: # Create new household ...

Slide 86

Slide 86 text

# When ingesting voter data for row in reader: ... # Normalize address raw_address = ' '.join( [house_number, direction_prefix, street, apartment]) address = normalize_address(raw_address) # Try to match to an existing household household = session.query(Household).filter_by(address=address).first() if household: voter.household_id = household.id else: # Create new household ...

Slide 87

Slide 87 text

# When ingesting voter data for row in reader: ... # Normalize address raw_address = ' '.join( [house_number, direction_prefix, street, apartment]) address = normalize_address(raw_address) # Try to match to an existing household household = session.query(Household).filter_by(address=address).first() if household: voter.household_id = household.id else: # Create new household ...

Slide 88

Slide 88 text

# When ingesting voter data for row in reader: ... # Normalize address raw_address = ' '.join( [house_number, direction_prefix, street, apartment]) address = normalize_address(raw_address) # Try to match to an existing household household = session.query(Household).filter_by(address=address).first() if household: voter.household_id = household.id else: # Create new household ...

Slide 89

Slide 89 text

# When ingesting student data... for row in reader: ... # Attach students to households address = normalize_address(raw_address) household = session.query(Household).filter_by(address=address).first() if household: student.household_id = household.id if mother_last_name and not household.mother_last_name: household.mother_last_name = mother_last_name if mother_first_name and not household.mother_first_name: household.mother_first_name = mother_first_name ... if phone and not household.phone: household.phone = phone else: # Create new household ...

Slide 90

Slide 90 text

# When ingesting student data... for row in reader: ... # Attach students to households address = normalize_address(raw_address) household = session.query(Household).filter_by(address=address).first() if household: student.household_id = household.id if mother_last_name and not household.mother_last_name: household.mother_last_name = mother_last_name if mother_first_name and not household.mother_first_name: household.mother_first_name = mother_first_name ... if phone and not household.phone: household.phone = phone else: # Create new household ...

Slide 91

Slide 91 text

# When ingesting student data... for row in reader: ... # Attach students to households address = normalize_address(raw_address) household = session.query(Household).filter_by(address=address).first() if household: student.household_id = household.id if mother_last_name and not household.mother_last_name: household.mother_last_name = mother_last_name if mother_first_name and not household.mother_first_name: household.mother_first_name = mother_first_name ... if phone and not household.phone: household.phone = phone else: # Create new household ...

Slide 92

Slide 92 text

# When ingesting staff, early childhood PTA, do not contact data... for row in reader: ... address = normalize_address(raw_address) household = session.query(Household).filter_by(address=address).first() if not household: continue ...

Slide 93

Slide 93 text

# When ingesting staff, early childhood PTA, do not contact data... for row in reader: ... address = normalize_address(raw_address) household = session.query(Household).filter_by(address=address).first() if not household: continue ...

Slide 94

Slide 94 text

# When ingesting staff, early childhood PTA, do not contact data... for row in reader: ... address = normalize_address(raw_address) household = session.query(Household).filter_by(address=address).first() if not household: continue ...

Slide 95

Slide 95 text

Updating Data with SQLAlchemy

Slide 96

Slide 96 text

A single model object: voter.absentee_ballot_requested = True session.commit() Many rows, without a model object: session.query(Voter).update({ 'absentee_ballot_requested': None, 'absentee_ballot_returned': None, 'absentee_ballot_challenged': None, }) session.commit() Updating the Database

Slide 97

Slide 97 text

Extraction

Slide 98

Slide 98 text

#!/bin/env python def get_data(...): ... def write_csv(...): ... def main(): data = get_data() write_csv(data) if __name__ == "__main__": main()

Slide 99

Slide 99 text

Querying with SQLAlchemy

Slide 100

Slide 100 text

model_object = session.query(ModelObject).first() model_objects = session.query(ModelObject).all() model_objects = session.query(ModelObject)[42:2112] model_object = session.query(ModelObject).\ filter(ModelObject.attribute == value).first() model_objects = session.query(ModelObject).\ filter(ModelObject.attribute == value).\ group_by(ModelObject.other_attribute).\ order_by(ModelObject.third_attribute).\ all()

Slide 101

Slide 101 text

model_object = session.query(ModelObject).first() model_objects = session.query(ModelObject).all() model_objects = session.query(ModelObject)[42:2112] model_object = session.query(ModelObject).\ filter(ModelObject.attribute == value).first() model_objects = session.query(ModelObject).\ filter(ModelObject.attribute == value).\ group_by(ModelObject.other_attribute).\ order_by(ModelObject.third_attribute).\ all()

Slide 102

Slide 102 text

model_object = session.query(ModelObject).first() model_objects = session.query(ModelObject).all() model_objects = session.query(ModelObject)[42:2112] model_object = session.query(ModelObject).\ filter(ModelObject.attribute == value).first() model_objects = session.query(ModelObject).\ filter(ModelObject.attribute == value).\ group_by(ModelObject.other_attribute).\ order_by(ModelObject.third_attribute).\ all() Offset Limit

Slide 103

Slide 103 text

model_object = session.query(ModelObject).first() model_objects = session.query(ModelObject).all() model_objects = session.query(ModelObject)[42:2112] model_object = session.query(ModelObject).\ filter(ModelObject.attribute == value).first() model_objects = session.query(ModelObject).\ filter(ModelObject.attribute == value).\ group_by(ModelObject.other_attribute).\ order_by(ModelObject.third_attribute).\ all()

Slide 104

Slide 104 text

model_object = session.query(ModelObject).first() model_objects = session.query(ModelObject).all() model_objects = session.query(ModelObject)[42:2112] model_object = session.query(ModelObject).\ filter(ModelObject.attribute == value).first() model_objects = session.query(ModelObject).\ filter(ModelObject.attribute == value).\ group_by(ModelObject.other_attribute).\ order_by(ModelObject.third_attribute).\ all()

Slide 105

Slide 105 text

A simple example: household = session.query(Household).\ filter_by(address=address).first() A more complex example: voters = session.query(Voter).join(Household).\ filter(Household.do_not_contact == None).\ filter(Voter.status.in_(allowed_statuses)).\ group_by(Voter.household_id).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() More Specifically...

Slide 106

Slide 106 text

Writing CSVs

Slide 107

Slide 107 text

def write_csv(model_objects): with open("report.csv", "w") as f: writer = csv.writer(f) header_row = ["attr0", "attr1", ...] writer.writerow(header_row) for model_object in model_objects: row = [model_object.attr0, model_object.attr1, ...] writer.writerow(row)

Slide 108

Slide 108 text

def write_csv(model_objects): with open("report.csv", "w") as f: writer = csv.writer(f) header_row = ["attr0", "attr1", ...] writer.writerow(header_row) for model_object in model_objects: row = [model_object.attr0, model_object.attr1, ...] writer.writerow(row)

Slide 109

Slide 109 text

def write_csv(model_objects): with open("report.csv", "w") as f: writer = csv.writer(f) header_row = ["attr0", "attr1", ...] writer.writerow(header_row) for model_object in model_objects: row = [model_object.attr0, model_object.attr1, ...] writer.writerow(row)

Slide 110

Slide 110 text

def write_csv(model_objects): with open("report.csv", "w") as f: writer = csv.writer(f) header_row = ["attr0", "attr1", ...] writer.writerow(header_row) for model_object in model_objects: row = [model_object.attr0, model_object.attr1, ...] writer.writerow(row)

Slide 111

Slide 111 text

Reports

Slide 112

Slide 112 text

Targeting Strategies

Slide 113

Slide 113 text

Active Voters • Registered voter • Voter status is active or provisional

Slide 114

Slide 114 text

Recent Voters • Have voted in a recent election... • That’s a non-Presidential election

Slide 115

Slide 115 text

School-Affiliated Voters • Kids enrolled in a school • Parents in early childhood PTA • School district staff

Slide 116

Slide 116 text

Counts Become Costs Total Active Recent School Affiliated Voters 23470 20648 11486 3748 Households 12764 11423 7238 1847

Slide 117

Slide 117 text

Counts Become Costs Total Active Recent School Affiliated Voters 23470 20648 11486 3748 Households 12764 11423 7238 1847 4500?

Slide 118

Slide 118 text

Mailing List

Slide 119

Slide 119 text

Mailing List • Households... • With a registered, active voter... • Who voted in May 2014... • Or who voted in November 2014 and has a student in school... • That aren’t flagged do-not-call

Slide 120

Slide 120 text

allowed_statuses = [VoterStatus.ACTIVE, VoterStatus.PROVISIONAL] def get_voters(): date1 = datetime.date(2014, 5, 6) date2 = datetime.date(2014, 11, 4) voters = session.query(Voter).join(Ballot).\ join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter(Household.do_not_contact == None).\ filter( or_( Ballot.election_date == date1, and_( Ballot.election_date == date2, Student.last_name != None ) )).\ group_by(Voter.household_id).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() return voters

Slide 121

Slide 121 text

allowed_statuses = [VoterStatus.ACTIVE, VoterStatus.PROVISIONAL] def get_voters(): date1 = datetime.date(2014, 5, 6) date2 = datetime.date(2014, 11, 4) voters = session.query(Voter).join(Ballot).\ join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter(Household.do_not_contact == None).\ filter( or_( Ballot.election_date == date1, and_( Ballot.election_date == date2, Student.last_name != None ) )).\ group_by(Voter.household_id).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() return voters

Slide 122

Slide 122 text

allowed_statuses = [VoterStatus.ACTIVE, VoterStatus.PROVISIONAL] def get_voters(): date1 = datetime.date(2014, 5, 6) date2 = datetime.date(2014, 11, 4) voters = session.query(Voter).join(Ballot).\ join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter(Household.do_not_contact == None).\ filter( or_( Ballot.election_date == date1, and_( Ballot.election_date == date2, Student.last_name != None ) )).\ group_by(Voter.household_id).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() return voters

Slide 123

Slide 123 text

allowed_statuses = [VoterStatus.ACTIVE, VoterStatus.PROVISIONAL] def get_voters(): date1 = datetime.date(2014, 5, 6) date2 = datetime.date(2014, 11, 4) voters = session.query(Voter).join(Ballot).\ join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter(Household.do_not_contact == None).\ filter( or_( Ballot.election_date == date1, and_( Ballot.election_date == date2, Student.last_name != None ) )).\ group_by(Voter.household_id).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() return voters

Slide 124

Slide 124 text

allowed_statuses = [VoterStatus.ACTIVE, VoterStatus.PROVISIONAL] def get_voters(): date1 = datetime.date(2014, 5, 6) date2 = datetime.date(2014, 11, 4) voters = session.query(Voter).join(Ballot).\ join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter(Household.do_not_contact == None).\ filter( or_( Ballot.election_date == date1, and_( Ballot.election_date == date2, Student.last_name != None ) )).\ group_by(Voter.household_id).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() return voters

Slide 125

Slide 125 text

allowed_statuses = [VoterStatus.ACTIVE, VoterStatus.PROVISIONAL] def get_voters(): date1 = datetime.date(2014, 5, 6) date2 = datetime.date(2014, 11, 4) voters = session.query(Voter).join(Ballot).\ join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter(Household.do_not_contact == None).\ filter( or_( Ballot.election_date == date1, and_( Ballot.election_date == date2, Student.last_name != None ) )).\ group_by(Voter.household_id).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() return voters

Slide 126

Slide 126 text

allowed_statuses = [VoterStatus.ACTIVE, VoterStatus.PROVISIONAL] def get_voters(): date1 = datetime.date(2014, 5, 6) date2 = datetime.date(2014, 11, 4) voters = session.query(Voter).join(Ballot).\ join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter(Household.do_not_contact == None).\ filter( or_( Ballot.election_date == date1, and_( Ballot.election_date == date2, Student.last_name != None ) )).\ group_by(Voter.household_id).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() return voters

Slide 127

Slide 127 text

allowed_statuses = [VoterStatus.ACTIVE, VoterStatus.PROVISIONAL] def get_voters(): date1 = datetime.date(2014, 5, 6) date2 = datetime.date(2014, 11, 4) voters = session.query(Voter).join(Ballot).\ join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter(Household.do_not_contact == None).\ filter( or_( Ballot.election_date == date1, and_( Ballot.election_date == date2, Student.last_name != None ) )).\ group_by(Voter.household_id).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() return voters

Slide 128

Slide 128 text

allowed_statuses = [VoterStatus.ACTIVE, VoterStatus.PROVISIONAL] def get_voters(): date1 = datetime.date(2014, 5, 6) date2 = datetime.date(2014, 11, 4) voters = session.query(Voter).join(Ballot).\ join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter(Household.do_not_contact == None).\ filter( or_( Ballot.election_date == date1, and_( Ballot.election_date == date2, Student.last_name != None ) )).\ group_by(Voter.household_id).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() return voters

Slide 129

Slide 129 text

allowed_statuses = [VoterStatus.ACTIVE, VoterStatus.PROVISIONAL] def get_voters(): date1 = datetime.date(2014, 5, 6) date2 = datetime.date(2014, 11, 4) voters = session.query(Voter).join(Ballot).\ join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter(Household.do_not_contact == None).\ filter( or_( Ballot.election_date == date1, and_( Ballot.election_date == date2, Student.last_name != None ) )).\ group_by(Voter.household_id).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() return voters

Slide 130

Slide 130 text

• 4279 households to mail to • Right in our sweet spot AND within budget How Many?

Slide 131

Slide 131 text

Call Lists

Slide 132

Slide 132 text

Call Lists • One list for each school • Households... • With a registered, active voter... • With a student in the school... • With a phone number... • That didn’t request an absentee ballot... • That aren’t flagged do-not-contact... • De-duplicated across lists

Slide 133

Slide 133 text

def make_list(school_name, file_name, households_seen): households = get_households(school_name) write_csv(filename, households, households_seen) def get_households(school_name): households = session.query(Household).join(Voter).join(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter(Voter.absentee_ballot_requested == None).\ filter(Student.school_name == school_name).\ filter(Household.phone != None).\ filter(Household.do_not_contact == None).\ order_by(Voter.last_name, Voter.first_name).\ all() return households def write_csv(filename, households, households_seen): ... for household in households: if household.id in households_seen: continue row = [...] writer.writerow(row) households_seen[household.id] = household

Slide 134

Slide 134 text

def make_list(school_name, file_name, households_seen): households = get_households(school_name) write_csv(filename, households, households_seen) def get_households(school_name): households = session.query(Household).join(Voter).join(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter(Voter.absentee_ballot_requested == None).\ filter(Student.school_name == school_name).\ filter(Household.phone != None).\ filter(Household.do_not_contact == None).\ order_by(Voter.last_name, Voter.first_name).\ all() return households def write_csv(filename, households, households_seen): ... for household in households: if household.id in households_seen: continue row = [...] writer.writerow(row) households_seen[household.id] = household

Slide 135

Slide 135 text

def make_list(school_name, file_name, households_seen): households = get_households(school_name) write_csv(filename, households, households_seen) def get_households(school_name): households = session.query(Household).join(Voter).join(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter(Voter.absentee_ballot_requested == None).\ filter(Student.school_name == school_name).\ filter(Household.phone != None).\ filter(Household.do_not_contact == None).\ order_by(Voter.last_name, Voter.first_name).\ all() return households def write_csv(filename, households, households_seen): ... for household in households: if household.id in households_seen: continue row = [...] writer.writerow(row) households_seen[household.id] = household

Slide 136

Slide 136 text

def make_list(school_name, file_name, households_seen): households = get_households(school_name) write_csv(filename, households, households_seen) def get_households(school_name): households = session.query(Household).join(Voter).join(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter(Voter.absentee_ballot_requested == None).\ filter(Student.school_name == school_name).\ filter(Household.phone != None).\ filter(Household.do_not_contact == None).\ order_by(Voter.last_name, Voter.first_name).\ all() return households def write_csv(filename, households, households_seen): ... for household in households: if household.id in households_seen: continue row = [...] writer.writerow(row) households_seen[household.id] = household

Slide 137

Slide 137 text

def main(): households_seen = {} make_list(SchoolNames.POKEY_OAKS, '...', households_seen) make_list(SchoolNames.SPRINGFIELD_ELEMENTARY, '...', households_seen) make_list(SchoolNames.HOGWARTS, '...', households_seen) make_list(SchoolNames.MORNING_GLORY, '...', households_seen) make_list(SchoolNames.XAVIERS, '...', households_seen) make_list(SchoolNames.UNSEEN_UNIVERSITY, '...', households_seen)

Slide 138

Slide 138 text

def main(): households_seen = {} make_list(SchoolNames.POKEY_OAKS, '...', households_seen) make_list(SchoolNames.SPRINGFIELD_ELEMENTARY, '...', households_seen) make_list(SchoolNames.HOGWARTS, '...', households_seen) make_list(SchoolNames.MORNING_GLORY, '...', households_seen) make_list(SchoolNames.XAVIERS, '...', households_seen) make_list(SchoolNames.UNSEEN_UNIVERSITY, '...', households_seen)

Slide 139

Slide 139 text

def main(): households_seen = {} make_list(SchoolNames.POKEY_OAKS, '...', households_seen) make_list(SchoolNames.SPRINGFIELD_ELEMENTARY, '...', households_seen) make_list(SchoolNames.HOGWARTS, '...', households_seen) make_list(SchoolNames.MORNING_GLORY, '...', households_seen) make_list(SchoolNames.XAVIERS, '...', households_seen) make_list(SchoolNames.UNSEEN_UNIVERSITY, '...', households_seen)

Slide 140

Slide 140 text

Unreturned Absentee Ballots

Slide 141

Slide 141 text

Unreturned Absentee Ballots • Households with... • One or more registered, active voters... • Who requested an absentee ballot... • But haven’t returned it... • And aren’t flagged do-not-contact

Slide 142

Slide 142 text

def get_voters(): voters = session.query(Voter).join(Household).\ filter(Voter.absentee_ballot_requested != None).\ filter(Voter.absentee_ballot_returned == None).\ filter(Household.do_not_contact == None).\ group_by(Voter.household_id).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() return voters

Slide 143

Slide 143 text

def get_voters(): voters = session.query(Voter).join(Household).\ filter(Voter.absentee_ballot_requested != None).\ filter(Voter.absentee_ballot_returned == None).\ filter(Household.do_not_contact == None).\ group_by(Voter.household_id).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() return voters

Slide 144

Slide 144 text

Shame List

Slide 145

Slide 145 text

Shame List • Registered, active voters who… • Are school district staff, or... • Are early childhood PTA members, or... • Have a student in school • Who haven’t voted recently

Slide 146

Slide 146 text

voters = session.query(Voter).join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter( or_( Student.first_name != None, or_( Household.staff == True, Household.early_childhood_pta == True ) ) ).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() shame_voters = [ voter for voter in voters if not any( [ballot for ballot in voter.ballots if end_date >= ballot.election_date >= start_date] ) ]

Slide 147

Slide 147 text

voters = session.query(Voter).join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter( or_( Student.first_name != None, or_( Household.staff == True, Household.early_childhood_pta == True ) ) ).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() shame_voters = [ voter for voter in voters if not any( [ballot for ballot in voter.ballots if end_date >= ballot.election_date >= start_date] ) ]

Slide 148

Slide 148 text

voters = session.query(Voter).join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter( or_( Student.first_name != None, or_( Household.staff == True, Household.early_childhood_pta == True ) ) ).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() shame_voters = [ voter for voter in voters if not any( [ballot for ballot in voter.ballots if end_date >= ballot.election_date >= start_date] ) ]

Slide 149

Slide 149 text

voters = session.query(Voter).join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter( or_( Student.first_name != None, or_( Household.staff == True, Household.early_childhood_pta == True ) ) ).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() shame_voters = [ voter for voter in voters if not any( [ballot for ballot in voter.ballots if end_date >= ballot.election_date >= start_date] ) ]

Slide 150

Slide 150 text

voters = session.query(Voter).join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter( or_( Student.first_name != None, or_( Household.staff == True, Household.early_childhood_pta == True ) ) ).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() shame_voters = [ voter for voter in voters if not any( [ballot for ballot in voter.ballots if end_date >= ballot.election_date >= start_date] ) ]

Slide 151

Slide 151 text

voters = session.query(Voter).join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter( or_( Student.first_name != None, or_( Household.staff == True, Household.early_childhood_pta == True ) ) ).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() shame_voters = [ voter for voter in voters if not any( [ballot for ballot in voter.ballots if end_date >= ballot.election_date >= start_date] ) ]

Slide 152

Slide 152 text

voters = session.query(Voter).join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter( or_( Student.first_name != None, or_( Household.staff == True, Household.early_childhood_pta == True ) ) ).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() shame_voters = [ voter for voter in voters if not any( [ballot for ballot in voter.ballots if end_date >= ballot.election_date >= start_date] ) ]

Slide 153

Slide 153 text

voters = session.query(Voter).join(Household).outerjoin(Student).\ filter(Voter.status.in_(allowed_statuses)).\ filter( or_( Student.first_name != None, or_( Household.staff == True, Household.early_childhood_pta == True ) ) ).\ order_by(Voter.last_name, Voter.first_name).\ options(contains_eager(Voter.household)).\ all() shame_voters = [ voter for voter in voters if not any( [ballot for ballot in voter.ballots if end_date >= ballot.election_date >= start_date] ) ]

Slide 154

Slide 154 text

Other Reports

Slide 155

Slide 155 text

Other Reports • Door-to-door list • School-affiliated absentee list • Praise list • Various counts

Slide 156

Slide 156 text

Concluding Thoughts

Slide 157

Slide 157 text

Next Steps • Post-election analysis • Absentee ballot returns • School-affiliated voter turnout • Party affiliation voter turnout • Refactor & repackage for reuse

Slide 158

Slide 158 text

Thoughts / Advice • Model only what you need to model • Code for repeatability • Shared understanding of query specification is 90% of the work • Expect arbitrary changes on short notice • Use your powers for Good

Slide 159

Slide 159 text

Useful Links • Cuyahoga County Board of Elections Data: http://boe.cuyahogacounty.us/en-US/downloads.aspx • USPS Street Suffix Abbreviations: http://pe.usps.gov/text/pub28/28apc_002.htm • SQLAlchemy ORM Tutorial: http://docs.sqlalchemy.org/en/latest/orm/tutorial.html • Alembic Tutorial: http://alembic.readthedocs.org/en/latest/tutorial.html

Slide 160

Slide 160 text

Contact Me Mike Pirnat http://mike.pirnat.com @mpirnat Thanks!