Slide 1

Slide 1 text

No content

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

PROGRAMMING STYLES
Rules and constraints in software construction

Slide 4

Slide 4 text

Programming Styles
⊳ Ways of expressing tasks
⊳ Exist and recur at all scales
⊳ Frozen in Programming Languages

Slide 5

Slide 5 text

Programming Styles
How do you communicate this?

Slide 6

Slide 6 text

Raymond Queneau

Slide 7

Slide 7 text

Queneau’s Exercises in Style
⊳ Metaphor
⊳ Surprises
⊳ Dream
⊳ Prognostication
⊳ Hesitation
⊳ Precision
⊳ Negativities
⊳ Asides
⊳ Anagrams
⊳ Logical analysis
⊳ Past
⊳ Present
⊳ …
⊳ (99)

Slide 8

Slide 8 text

Oulipo’s “Styles”
⊳ Constraints
⊳ Potential literature: “the seeking of new structures and patterns which may be used by writers in any way they enjoy.”
⊳ E.g. “A Void” (La Disparition) by Georges Perec

Slide 9

Slide 9 text

Exercises in Programming Style
The story: Term Frequency
Given a text file, output a list of the 25 most frequently occurring words, ordered by decreasing frequency.

Slide 10

Slide 10 text

Exercises in Programming Style
The story: Term Frequency
Given a text file, output a list of the 25 most frequently occurring words, ordered by decreasing frequency.

TF Pride and Prejudice
mr - 786
elizabeth - 635
very - 488
darcy - 418
such - 395
mrs - 343
much - 329
more - 327
bennet - 323
bingley - 306
jane - 295
miss - 283
one - 275
know - 239
before - 229
herself - 227
though - 226
well - 224
never - 220
…

Slide 11

Slide 11 text

EPS, the book
⊳ Part I: Historical
⊳ Part II: Basic Styles
⊳ Part III: Function Composition
⊳ Part IV: Objects and Object Interaction
⊳ Part V: Reflection and Metaprogramming
⊳ Part VI: Adversity
⊳ Part VII: Data-Centric
⊳ Part VIII: Concurrency
⊳ Part IX: Interactivity

Slide 12

Slide 12 text

STYLE #3

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

import sys, string

# the global list of [word, frequency] pairs
word_freqs = []
# the list of stop words
with open('../stop_words.txt') as f:
    stop_words = f.read().split(',')
stop_words.extend(list(string.ascii_lowercase))

Slide 15

Slide 15 text

for line in open(sys.argv[1]):
    for c in line:

Slide 16

Slide 16 text

Style #3 Constraints
⊳ No abstractions
⊳ No use of library functions

Slide 17

Slide 17 text

Style #3 Constraints
⊳ No abstractions
⊳ No use of library functions

Monolithic Style
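
A minimal sketch of what these two constraints can look like, written in Python 3 rather than the Python 2 of the slides. The sample text and the three stop words are made up for illustration, and a few built-ins (`isalnum`, `lower`, `append`, `range`, `len`) are still used, so this only approximates the "no library functions" rule.

```python
# Monolithic sketch: one block of code, no functions, no sort() call.
# The text and stop words below are illustrative assumptions.
text = "The cat sat on the mat. The cat ran."
stop_words = ["the", "a", "on"]

word_freqs = []  # list of [word, count] pairs
word = ""
for c in text + " ":  # the trailing space flushes the last word
    if c.isalnum():
        word += c.lower()
    elif word:
        if word not in stop_words:
            found = False
            for pair in word_freqs:
                if pair[0] == word:
                    pair[1] += 1
                    found = True
                    break
            if not found:
                word_freqs.append([word, 1])
        word = ""

# Order by decreasing count with a hand-written insertion sort.
for i in range(1, len(word_freqs)):
    j = i
    while j > 0 and word_freqs[j][1] > word_freqs[j - 1][1]:
        word_freqs[j], word_freqs[j - 1] = word_freqs[j - 1], word_freqs[j]
        j -= 1

for tf in word_freqs[0:25]:
    print(tf[0], "-", tf[1])
```

Everything happens in one place, so any change to tokenizing, counting, or sorting means editing the same block.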

Slide 18

Slide 18 text

STYLE #6 @cristalopes #style2 name

Slide 19

Slide 19 text

import re, sys, collections

stopwords = set(open('../stop_words.txt').read().split(','))
words = re.findall('[a-z]{2,}', open(sys.argv[1]).read().lower())
counts = collections.Counter(w for w in words if w not in stopwords)
for (w, c) in counts.most_common(25):
    print w, '-', c

Credit: Peter Norvig

Slide 20

Slide 20 text

import re, sys, collections

stopwords=set(open('../stop_words.txt').read().split(','))
words = re.findall('[a-z]{2,}', open(sys.argv[1]).read().lower())
counts = collections.Counter(w for w in words \
                             if w not in stopwords)
for (w, c) in counts.most_common(25):
    print w, '-', c

Slide 21

Slide 21 text

import re, string, sys

stops = set(open("../stop_words.txt").read().split(",") + list(string.ascii_lowercase))
words = [x.lower() for x in re.split("[^a-zA-Z]+", open(sys.argv[1]).read()) if len(x) > 0 and x.lower() not in stops]
unique_words = list(set(words))
unique_words.sort(lambda x,y:cmp(words.count(y), words.count(x)))
print "\n".join(["%s - %s" % (x, words.count(x)) for x in unique_words[:25]])

Slide 22

Slide 22 text

Style #6 Constraints
⊳ As few lines of code as possible

Slide 23

Slide 23 text

Style #6 Constraints
⊳ As few lines of code as possible

Code Golf Style

Slide 24

Slide 24 text

Style #6 Constraints
⊳ As few lines of code as possible

Try Hard Style

Slide 25

Slide 25 text

STYLE #4

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

data=[]
words=[]
freqs=[]

def read_file(path): ...
def filter_normalize(): ...
def scan(): ...
def rem_stop_words(): ...
def frequencies(): ...
def sort(): ...

#
# Main
#
read_file(sys.argv[1])
filter_normalize()
scan()
rem_stop_words()
frequencies()
sort()

for tf in word_freqs[0:25]:
    print tf[0], ' - ', tf[1]

Slide 28

Slide 28 text

Style #4 Constraints
⊳ Procedural abstractions
  • maybe input, no output
⊳ Shared state
⊳ Larger problem solved by applying procedures, one after the other, changing the shared state

Slide 29

Slide 29 text

Style #4 Constraints
⊳ Procedural abstractions
  • maybe input, no output
⊳ Shared state
⊳ Series of commands

Cook Book Style
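
The constraints above can be sketched as follows, in Python 3 and on an in-memory string instead of a file. The procedure `load_text`, the sample sentence, and the `STOP_WORDS` set are assumptions made for this sketch, not the talk's code; the remaining procedure names follow the slide.

```python
# Cook Book sketch: procedures take little or no input, return nothing,
# and communicate only through the shared state below.
data = []        # characters of the input
words = []       # words extracted from data
word_freqs = []  # [word, count] pairs

STOP_WORDS = {"the", "a", "on"}  # illustrative assumption


def load_text(text):
    """Fill the shared 'data' list with the characters of text."""
    data.extend(text)


def filter_normalize():
    """Replace non-alphanumeric characters with spaces, lower-case the rest."""
    for i, c in enumerate(data):
        data[i] = c.lower() if c.isalnum() else " "


def scan():
    """Split the shared character data into the shared word list."""
    words.extend("".join(data).split())


def rem_stop_words():
    """Remove stop words from the shared word list in place."""
    words[:] = [w for w in words if w not in STOP_WORDS]


def frequencies():
    """Count each word into the shared word_freqs list."""
    for w in words:
        for pair in word_freqs:
            if pair[0] == w:
                pair[1] += 1
                break
        else:
            word_freqs.append([w, 1])


def sort():
    """Order the shared word_freqs list by decreasing count."""
    word_freqs.sort(key=lambda pair: pair[1], reverse=True)


# The "recipe": one step after the other, each changing the shared state.
load_text("The cat sat on the mat. The cat ran.")
filter_normalize()
scan()
rem_stop_words()
frequencies()
sort()

for tf in word_freqs[0:25]:
    print(tf[0], "-", tf[1])
```

Because every step mutates shared state, the order of the calls matters and no step can safely be run twice.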

Slide 30

Slide 30 text

STYLE #5

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

def read_file(path):
    return ...
def filter(str_data):
    return ...
def normalize(str_data):
    return ...
def scan(str_data):
    return ...
def rem_stop_words(wordl):
    return ...
def frequencies(wordl):
    return ...
def sort(word_freqs):
    return ...

#
# Main
#
wfreqs=sort(frequencies(rem_stop_words(scan(normalize(filter(read_file(sys.argv[1])))))))

for tf in wfreqs[0:25]:
    print tf[0], ' - ', tf[1]

Slide 33

Slide 33 text

Style #5 Constraints
⊳ Function abstractions
  • f: Input → Output
⊳ No shared state
⊳ Function composition f ∘ g

Slide 34

Slide 34 text

Style #5 Constraints
⊳ Function abstractions
  • f: Input → Output
⊳ No shared state
⊳ Function composition f ∘ g

Pipeline Style
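
One way to make the composition explicit is a small `compose` helper, sketched here in Python 3. The helper, the sample text, and the `STOP_WORDS` set are assumptions for illustration; the stage names follow the slide, with filtering folded into `scan` to keep the sketch short.

```python
import re
from collections import Counter
from functools import reduce

STOP_WORDS = frozenset({"the", "a", "on"})  # illustrative assumption


def normalize(text):
    return text.lower()


def scan(text):
    return re.findall(r"[a-z]{2,}", text)


def rem_stop_words(word_list):
    return [w for w in word_list if w not in STOP_WORDS]


def frequencies(word_list):
    return Counter(word_list)


def sort(freqs):
    return sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)


def compose(*funcs):
    """Return f1 ∘ f2 ∘ ... ∘ fn; the rightmost function is applied first."""
    return reduce(lambda f, g: lambda x: f(g(x)), funcs)


# Each stage maps an input to an output; none of them touches shared state.
term_frequency = compose(sort, frequencies, rem_stop_words, scan, normalize)
result = term_frequency("The cat sat on the mat. The cat ran.")

for word, count in result[:25]:
    print(word, "-", count)
```

Since every function depends only on its argument, each stage can be tested or replaced in isolation.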

Slide 35

Slide 35 text

STYLE #8

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

def read_file(path, func):
    ...
    return func(…, normalize)

def filter_chars(data, func):
    ...
    return func(…, scan)

def normalize(data, func):
    ...
    return func(…, remove_stops)

def scan(data, func):
    ...
    return func(…, frequencies)

def remove_stops(data, func):
    ...
    return func(…, sort)

Etc.

# Main
w_freqs=read_file(sys.argv[1], filter_chars)

for tf in w_freqs[0:25]:
    print tf[0], ' - ', tf[1]

Slide 38

Slide 38 text

Style #8 Constraints
⊳ Functions take one additional parameter, f
  • called at the end
  • given what would normally be the return value plus the next function

Slide 39

Slide 39 text

Style #8 Constraints
⊳ Functions take one additional parameter, f
  • called at the end
  • given what would normally be the return value plus the next function

Kick forward Style
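
A compact sketch of the same idea in Python 3, working on a string rather than a file. The sample text, the `STOP_WORDS` set, and the terminal `identity` function are assumptions added for this sketch; the chaining pattern follows the slide.

```python
import re

STOP_WORDS = {"the", "a", "on"}  # illustrative assumption


def normalize(text, func):
    # Instead of returning, hand the result and the next step to func.
    return func(text.lower(), remove_stops)


def scan(text, func):
    return func(re.findall(r"[a-z]{2,}", text), frequencies)


def remove_stops(words, func):
    return func([w for w in words if w not in STOP_WORDS], sort)


def frequencies(words, func):
    counts = {}
    for w in words:
        counts[w] = counts.get(w, 0) + 1
    return func(counts, identity)


def sort(counts, func):
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return func(ordered, None)


def identity(value, func):
    """Hypothetical last link in the chain: it ignores func and stops."""
    return value


# Kick the computation forward by naming only the first two steps.
result = normalize("The cat sat on the mat. The cat ran.", scan)

for word, count in result[:25]:
    print(word, "-", count)
```

Each function names the step after its successor, so the whole control flow is spelled out in the arguments rather than in a main program.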

Slide 40

Slide 40 text

STYLE #10

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

class TFExercise():
    def info(self): ...

class DataStorageManager(TFExercise):
    def words(self): ...
    def info(self): ...

class StopWordManager(TFExercise):
    def is_stop_word(self, word): ...
    def info(self): ...

class WordFreqManager(TFExercise):
    def inc_count(self, word): ...
    def sorted(self): ...
    def info(self): ...

class WordFreqController(TFExercise):
    def run(self): ...

# Main
WordFreqController(sys.argv[1]).run()

Slide 43

Slide 43 text

Style #10 Constraints
⊳ Things, things and more things!
  • Capsules of data and procedures
⊳ Data is never accessed directly
⊳ Capsules can reappropriate procedures from other capsules

Slide 44

Slide 44 text

Style #10 Constraints
⊳ Things, things and more things!
  • Capsules of data and procedures
⊳ Data is never accessed directly
⊳ Capsules can reappropriate procedures from other capsules

Kingdom of Nouns Style
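
As a rough illustration of these constraints, the sketch below fills in the class outline from the slide in Python 3. The method bodies, the constructor signatures (text and stop words passed in directly rather than read from files), and the sample data are all assumptions.

```python
import re


class TFExercise:
    def info(self):
        return self.__class__.__name__


class DataStorageManager(TFExercise):
    """Capsule that owns the words of the text."""

    def __init__(self, text):
        self._words = re.findall(r"[a-z]{2,}", text.lower())

    def words(self):
        return self._words

    def info(self):
        # Reappropriates the parent's procedure and extends it.
        return super().info() + ": holds %d words" % len(self._words)


class StopWordManager(TFExercise):
    def __init__(self, stop_words):
        self._stop_words = set(stop_words)

    def is_stop_word(self, word):
        return word in self._stop_words


class WordFreqManager(TFExercise):
    def __init__(self):
        self._counts = {}

    def inc_count(self, word):
        self._counts[word] = self._counts.get(word, 0) + 1

    def sorted(self):
        # Inside the method body, sorted() is still the built-in.
        return sorted(self._counts.items(), key=lambda kv: kv[1], reverse=True)


class WordFreqController(TFExercise):
    def __init__(self, text, stop_words):
        self._storage = DataStorageManager(text)
        self._stops = StopWordManager(stop_words)
        self._freqs = WordFreqManager()

    def run(self):
        for w in self._storage.words():
            if not self._stops.is_stop_word(w):
                self._freqs.inc_count(w)
        return self._freqs.sorted()[:25]


controller = WordFreqController("The cat sat on the mat. The cat ran.",
                                ["the", "a", "on"])
result = controller.run()
for word, count in result:
    print(word, "-", count)
```

The controller never reads `_words`, `_stop_words`, or `_counts` directly; it only calls the procedures each capsule exposes.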

Slide 45

Slide 45 text

STYLE #11

Slide 46

Slide 46 text

No content

Slide 47

Slide 47 text

class DataStorageManager():
    def dispatch(self, message): ...

class StopWordManager():
    def dispatch(self, message): ...

class WordFrequencyManager():
    def dispatch(self, message): ...

class WordFrequencyController():
    def dispatch(self, message): ...

# Main
wfcntrl = WordFrequencyController()
wfcntrl.dispatch(['init',sys.argv[1]])
wfcntrl.dispatch(['run'])

Slide 48

Slide 48 text

Style #11 Constraints
⊳ (Similar to #10)
⊳ Capsules receive messages via single receiving procedure

Slide 49

Slide 49 text

Style #11 Constraints
⊳ (Similar to #10)
⊳ Capsules receive messages via single receiving procedure

Letterbox Style
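
The single receiving procedure might look like the Python 3 sketch below. The message names other than 'init' and 'run', the extra stop-word argument in the 'init' message, and the sample data are assumptions, and the data storage capsule is folded into the controller to keep the example short.

```python
import re


class StopWordManager:
    def __init__(self):
        self._stop_words = set()

    def dispatch(self, message):
        # The only public entry point: decode the message and act on it.
        if message[0] == "init":
            self._stop_words = set(message[1])
        elif message[0] == "is_stop_word":
            return message[1] in self._stop_words
        else:
            raise ValueError("Message not understood: %s" % message[0])


class WordFrequencyManager:
    def __init__(self):
        self._counts = {}

    def dispatch(self, message):
        if message[0] == "increment_count":
            self._counts[message[1]] = self._counts.get(message[1], 0) + 1
        elif message[0] == "sorted":
            return sorted(self._counts.items(),
                          key=lambda kv: kv[1], reverse=True)
        else:
            raise ValueError("Message not understood: %s" % message[0])


class WordFrequencyController:
    def dispatch(self, message):
        if message[0] == "init":
            # message = ['init', text, stop_words] in this sketch
            self._words = re.findall(r"[a-z]{2,}", message[1].lower())
            self._stops = StopWordManager()
            self._stops.dispatch(["init", message[2]])
            self._freqs = WordFrequencyManager()
        elif message[0] == "run":
            for w in self._words:
                if not self._stops.dispatch(["is_stop_word", w]):
                    self._freqs.dispatch(["increment_count", w])
            return self._freqs.dispatch(["sorted"])[:25]
        else:
            raise ValueError("Message not understood: %s" % message[0])


wfcntrl = WordFrequencyController()
wfcntrl.dispatch(["init", "The cat sat on the mat. The cat ran.",
                  ["the", "a", "on"]])
result = wfcntrl.dispatch(["run"])
for word, count in result:
    print(word, "-", count)
```

Callers need to know only the message vocabulary, not the methods behind it, which is what distinguishes this style from #10.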

Slide 50

Slide 50 text

STYLE #30

Slide 51

Slide 51 text

No content

Slide 52

Slide 52 text

# Main
splits = map(split_words, partition(read_file(sys.argv[1]), 200))
splits.insert(0, [])
word_freqs = sort(reduce(count_words, splits))

for tf in word_freqs[0:25]:
    print tf[0], ' - ', tf[1]

Slide 53

Slide 53 text

def split_words(data_str):
    """
    Takes a string (many lines), filters, normalizes to lower case,
    scans for words, and filters the stop words.
    Returns a list of pairs (word, 1), so
    [(w1, 1), (w2, 1), ..., (wn, 1)]
    """
    ...
    result = []
    words = _rem_stop_words(_scan(_normalize(_filter(data_str))))
    for w in words:
        result.append((w, 1))
    return result

Slide 54

Slide 54 text

def count_words(pairs_list_1, pairs_list_2):
    """
    Takes two lists of pairs of the form
    [(w1, 1), ...]
    and returns a list of pairs [(w1, frequency), ...],
    where frequency is the sum of all occurrences
    """
    mapping = dict((k, v) for k, v in pairs_list_1)
    for p in pairs_list_2:
        if p[0] in mapping:
            mapping[p[0]] += p[1]
        else:
            mapping[p[0]] = 1
    return mapping.items()

Slide 55

Slide 55 text

Style #30 Constraints
⊳ Two key abstractions: map(f, chunks) and reduce(g, results)

Slide 56

Slide 56 text

Style #30 Constraints
⊳ Two key abstractions: map(f, chunks) and reduce(g, results)

Map-Reduce Style
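
A self-contained Python 3 sketch of the two abstractions, using a three-line string and one-line chunks instead of a file and 200-line chunks. The `partition` body, the dictionary accumulator, the sample text, and the `STOP_WORDS` set are assumptions for this sketch.

```python
import re
from functools import reduce

STOP_WORDS = {"the", "a", "on"}  # illustrative assumption


def partition(text, nlines):
    """Yield chunks of text, each nlines lines long (the last may be shorter)."""
    lines = text.split("\n")
    for i in range(0, len(lines), nlines):
        yield "\n".join(lines[i:i + nlines])


def split_words(chunk):
    """Map step: turn one chunk into a list of (word, 1) pairs."""
    found = re.findall(r"[a-z]{2,}", chunk.lower())
    return [(w, 1) for w in found if w not in STOP_WORDS]


def count_words(counts, pairs):
    """Reduce step: merge one list of (word, n) pairs into the running counts."""
    merged = dict(counts)
    for word, n in pairs:
        merged[word] = merged.get(word, 0) + n
    return merged


text = "The cat sat\non the mat\nThe cat ran"
splits = map(split_words, partition(text, 1))
counts = reduce(count_words, splits, {})
word_freqs = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)

for tf in word_freqs[0:25]:
    print(tf[0], "-", tf[1])
```

The map step touches each chunk independently, so in principle the chunks could be processed in parallel before the reduce step combines them.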

Slide 57

Slide 57 text

STYLE #25

Slide 58

Slide 58 text

No content

Slide 59

Slide 59 text

# Main
connection = sqlite3.connect(':memory:')
create_db_schema(connection)
load_file_into_database(sys.argv[1], connection)

# Now, let's query
c = connection.cursor()
c.execute("SELECT value, COUNT(*) as C FROM words GROUP BY value ORDER BY C DESC")
for i in range(25):
    row = c.fetchone()
    if row != None:
        print row[0] + ' - ' + str(row[1])

connection.close()

Slide 60

Slide 60 text

def create_db_schema(connection):
    c = connection.cursor()
    # SQLite allows AUTOINCREMENT only on an INTEGER PRIMARY KEY column
    c.execute('''CREATE TABLE documents(id INTEGER PRIMARY KEY AUTOINCREMENT, name)''')
    c.execute('''CREATE TABLE words(id, doc_id, value)''')
    c.execute('''CREATE TABLE characters(id, word_id, value)''')
    connection.commit()
    c.close()

Slide 61

Slide 61 text

# Now let's add data to the database
# Add the document itself to the database
c = connection.cursor()
c.execute("INSERT INTO documents (name) VALUES (?)", (path_to_f
c.execute("SELECT id from documents WHERE name=?", (path_to_fil
doc_id = c.fetchone()[0]

# Add the words to the database
c.execute("SELECT MAX(id) FROM words")
row = c.fetchone()
word_id = row[0]
if word_id == None:
    word_id = 0
for w in words:
    c.execute("INSERT INTO words VALUES (?, ?, ?)", (word_id, d
    # Add the characters to the database
    char_id = 0
    for char in w:
        c.execute("INSERT INTO characters VALUES (?, ?, ?)", (c
        char_id += 1
    word_id += 1
connection.commit()
c.close()

Slide 62

Slide 62 text

Style #25 Constraints
⊳ Entities and relations between them
⊳ Query engine
  • Declarative queries

Slide 63

Slide 63 text

Style #25 Constraints
⊳ Entities and relations between them
⊳ Query engine
  • Declarative queries

Persistent Tables Style
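
A reduced Python 3 sketch of the idea, using only a single `words` table in an in-memory SQLite database. The one-table schema, the sample text, the stop words, and the `LIMIT 25` and tie-breaking `ORDER BY` clauses are assumptions that differ from the three-table schema shown on the slides.

```python
import re
import sqlite3

STOP_WORDS = {"the", "a", "on"}  # illustrative assumption
text = "The cat sat on the mat. The cat ran."
words = [w for w in re.findall(r"[a-z]{2,}", text.lower())
         if w not in STOP_WORDS]

connection = sqlite3.connect(":memory:")
c = connection.cursor()

# The data lives in a table; one row per word occurrence.
c.execute("CREATE TABLE words (id INTEGER PRIMARY KEY, value TEXT)")
c.executemany("INSERT INTO words (value) VALUES (?)", [(w,) for w in words])
connection.commit()

# The query states what is wanted; the engine decides how to compute it.
c.execute("""SELECT value, COUNT(*) AS n
             FROM words
             GROUP BY value
             ORDER BY n DESC, value
             LIMIT 25""")
rows = c.fetchall()
connection.close()

for value, n in rows:
    print(value, "-", n)
```

Once the data is in tables, other questions (for example, counts per document) become new queries rather than new programs.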

Slide 64

Slide 64 text

Take Home
⊳ Many ways of solving problems
  • Know them, assess them
  • What are you trying to optimize?
⊳ Constraints are important for communication
  • Make them explicit
⊳ Don’t be hostage to one way of doing things

Slide 65

Slide 65 text

@cristalopes