You Used Python For WHAT?!

A talk I gave on various things I've attempted to do in Python over the years. Given 2012-03-22 at the Boston Python Meetup.

James Tauber

March 22, 2012

Transcript

  1. YOU USED PYTHON FOR WHAT?! James Tauber

  2. YOU USED PYTHON FOR WHAT?! James Tauber ARE USING

  5. Quisition ?

  6. habitualist

  7. YOU USED PYTHON FOR WHAT?!

  8. jtauber.github.com jtauber.com @jtauber

  9. THE TOPICS
     • Some Mathematics
     • Some Weird Programming Languages
     • Analyzing Keyboard Performances
     • Generating Ancient Greek Graded Readers
     • Emulating the Apple ][
     • Writing an Operating System
  10. PRIMARY GOAL YOU FIND AT LEAST ONE OF THESE INTERESTING

  11. SECONDARY GOAL YOU QUESTION MY SANITY

  12. I DON’T REALLY UNDERSTAND SOMETHING UNTIL I’VE IMPLEMENTED IT IN PYTHON
  13. BRAINF***

  14. EIGHT COMMANDS
      >  increment the data pointer
      <  decrement the data pointer
      +  increment the byte at the data pointer
      -  decrement the byte at the data pointer
      .  output the byte at the data pointer as an ASCII-encoded character
      ,  accept one byte of input, storing its value in the byte at the data pointer
      [  if the byte at the data pointer is zero, jump forward after matching ]
      ]  if the byte at the data pointer is nonzero, jump back after matching [
  15. >+++++++++[<++++++++>-]<. >+++++++[<++++>-]<+. +++++++. . +++. [-]>++++++++[<++++>-]<. >+++++++++++[<++++++++>-]<-. --------. +++. ------. --------. [-]>++++++++[<++++>-]<+. [-]++++++++++.
  16. import sys
      class brainf:
          def __init__(self, program):
              self.mem = [0] * 65536
              self.p = 0
              self.pc = 0
              self.program = program
  17. def run(self, stop=None):
          if stop is None:
              stop = len(self.program)
          while self.pc < stop:
              c = self.program[self.pc]
              if c == ">":
                  self.p += 1
              elif c == "<":
                  self.p -= 1
              elif c == "+":
                  self.mem[self.p] += 1
              elif c == "-":
                  self.mem[self.p] -= 1
              elif c == ".":
                  sys.stdout.write(chr(self.mem[self.p]))
              elif c == ",":
                  self.mem[self.p] = ord(sys.stdin.read(1))
              elif c == "[":
                  ...
  18. elif c == "[":
          depth = 1
          start = end = self.pc
          # scan forward to the matching "]"
          while depth:
              end += 1
              if self.program[end] == "[":
                  depth += 1
              if self.program[end] == "]":
                  depth -= 1
          # run the loop body recursively while the current cell is nonzero
          while self.mem[self.p]:
              self.pc = start + 1
              self.run(end)
          self.pc = end
      elif c == "]":
          raise SyntaxError("unbalanced ]")
      else:
          pass
      self.pc += 1
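The interpreter on the slides rescans for the matching bracket and recurses each time it enters a loop. A common alternative is to match all brackets once, up front. The sketch below is Python 3 and is not from the talk: the names `build_jump_table` and `run_bf` are made up for illustration, cells are assumed to wrap at 256, end of input is assumed to read as 0, and output is returned as a string instead of being written to stdout.

```python
def build_jump_table(program):
    """Map the index of every bracket to the index of its partner."""
    table, stack = {}, []
    for i, c in enumerate(program):
        if c == "[":
            stack.append(i)
        elif c == "]":
            if not stack:
                raise SyntaxError("unbalanced ]")
            j = stack.pop()
            table[i], table[j] = j, i
    if stack:
        raise SyntaxError("unbalanced [")
    return table


def run_bf(program, input_text=""):
    """Run a program and return everything it printed as a string."""
    jumps = build_jump_table(program)
    mem = [0] * 65536
    p = pc = 0
    inp = iter(input_text)
    out = []
    while pc < len(program):
        c = program[pc]
        if c == ">":
            p += 1
        elif c == "<":
            p -= 1
        elif c == "+":
            mem[p] = (mem[p] + 1) % 256    # assumption: 8-bit wrapping cells
        elif c == "-":
            mem[p] = (mem[p] - 1) % 256
        elif c == ".":
            out.append(chr(mem[p]))
        elif c == ",":
            mem[p] = ord(next(inp, "\0"))  # assumption: EOF reads as 0
        elif c == "[" and mem[p] == 0:
            pc = jumps[pc]                 # skip the loop body
        elif c == "]" and mem[p] != 0:
            pc = jumps[pc]                 # go round the loop again
        pc += 1                            # any other character is a comment
    return "".join(out)


print(run_bf("++++++++[>++++++++<-]>+."))  # 8 * 8 + 1 = 65, prints "A"
```

With this version a loop costs one dictionary lookup per iteration, and an unbalanced program is rejected before anything runs.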
  19. ,>++++++[<-------->-],[<+>-],<.>.

  20. GIBBONS-LESTER-BIRD ALGORITHM FOR ENUMERATING THE POSITIVE RATIONALS

  21. def reciprocal((n, d)):
          return (d, n)
      def one_take((n, d)):
          return (d - n, d)
      def proper_fraction((n, d)):
          return (n // d, (n % d, d))
      def improper_fraction(i, (n, d)):
          return ((d * i) + n, d)
  22. def rationals():
          r = (0,1)
          while True:
              n, y = proper_fraction(r)
              z = improper_fraction(n, one_take(y))
              r = reciprocal(z)
              yield r
  23. def rationals():
          r = (0,1)
          while True:
              r = (r[1], (r[1] * (r[0] // r[1])) + (r[1] - (r[0] % r[1])))
              yield r
  24. def rationals(r=(0,1)):
          while True:
              r = (r[1], (r[1] * (r[0] // r[1])) + (r[1] - (r[0] % r[1])))
              yield r
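The helper functions on the slides use Python 2 tuple parameters, which Python 3 no longer accepts. Writing x = n/d, the one-line update above works out to x → 1 / (2⌊x⌋ − x + 1). A minimal Python 3 sketch of that same recurrence, using `fractions.Fraction` in place of the (n, d) tuples (a substitution made here, not something shown in the talk):

```python
from fractions import Fraction
from itertools import islice
from math import floor


def rationals():
    """Yield positive rationals in the same order as the slide version."""
    x = Fraction(0)
    while True:
        # floor(x) is the whole part, so 2*floor(x) - x + 1 is
        # "whole part + (1 - fractional part)", as on the slides.
        x = 1 / (2 * floor(x) - x + 1)
        yield x


first = list(islice(rationals(), 7))
print(", ".join(str(q) for q in first))  # 1, 1/2, 2, 1/3, 3/2, 2/3, 3
```

`Fraction` keeps every value in lowest terms automatically, which the tuple version gets for free from the structure of the recurrence.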
  25. PLOUFFE FORMULA FOR PI IN HEX

  26. def pi():
          N = 0
          n, d = 0, 1
          while True:
              xn = (120*N**2 + 151*N + 47)
              xd = (512*N**4 + 1024*N**3 + 712*N**2 + 194*N + 15)
              n = ((16 * n * xd) + (xn * d)) % (d * xd)
              d *= xd
              yield 16 * n // d
              N += 1
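The slides do not say where the polynomials `xn` and `xd` come from. They match the four terms of the Bailey-Borwein-Plouffe series put over a common denominator, so the following is offered as a likely reconstruction, not as the speaker's own derivation:

```latex
\pi = \sum_{k=0}^{\infty} \frac{1}{16^{k}}
      \left( \frac{4}{8k+1} - \frac{2}{8k+4} - \frac{1}{8k+5} - \frac{1}{8k+6} \right)
    = \sum_{k=0}^{\infty} \frac{1}{16^{k}} \cdot
      \frac{120k^{2} + 151k + 47}{512k^{4} + 1024k^{3} + 712k^{2} + 194k + 15}
```

At k = 0 the term is 47/15, about 3.133. Each pass of the loop multiplies the running fraction n/d by 16, adds the next term, and keeps only the fractional part, so `16 * n // d` is the next hex digit after the point. The first three values yielded are 2, 4 and 3, matching π = 3.243F... in hex.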
  27. EVEN WHEN I’VE IMPLEMENTED IT IN PYTHON, THAT DOESN’T MEAN I UNDERSTAND IT
  28. CHURCH ENCODING

  29. TRUE = lambda a: lambda b: (a)
      FALSE = lambda a: lambda b: (b)
      (TRUE)(True)(False) == True
      (FALSE)(True)(False) == False
  30. AND = lambda a: lambda b: (a)(b)(a)
      OR = lambda a: lambda b: (a)(a)(b)
      NOT = lambda a: lambda b: lambda c: (a)(c)(b)
      (AND)(TRUE)(FALSE) == (FALSE)
      (AND)(TRUE)(FALSE)(True)(False) == False
  31. CONS = lambda a: lambda b: lambda c: (c)(a)(b)
      CAR = lambda a: (a)(TRUE)
      CDR = lambda a: (a)(FALSE)
      (CAR)((CONS)(1)(2)) == 1
      (CDR)((CONS)(1)(2)) == 2
  32. UNCHURCH_BOOLEAN = (CONS)(True)(False)
      (UNCHURCH_BOOLEAN)((NOT)(TRUE)) == False
      (UNCHURCH_BOOLEAN)((OR)(TRUE)(FALSE)) == True

  33. ZERO = FALSE
      SUCC = lambda a: lambda b: lambda c: (b)((a)(b)(c))
      ONE = (SUCC)(ZERO)
      TWO = (SUCC)(ONE)
      THREE = (SUCC)(TWO)
      FOUR = (SUCC)(THREE)
      def church_number(n):
          return SUCC(church_number(n - 1)) if n else FALSE
  34. PLUS = lambda a: lambda b: lambda c: lambda d: (a)(c)((b)(c)(d))
      MULT = lambda a: lambda b: lambda c: (b)((a)(c))
      EXP = lambda a: lambda b: (b)(a)
      UNCHURCH_NUMBER = lambda a: (a)(lambda b: b + 1)(0)
  35. (UNCHURCH_NUMBER)(ZERO) == 0
      (UNCHURCH_NUMBER)(ONE) == 1
      (UNCHURCH_NUMBER)(TWO) == 2
      (UNCHURCH_NUMBER)((PLUS)(THREE)(TWO)) == 5
      (UNCHURCH_NUMBER)((MULT)(THREE)(TWO)) == 6
      (UNCHURCH_NUMBER)((EXP)(THREE)(TWO)) == 9
  36. GIT-STYLE VERSIONING OF PYTHON DATA STRUCTURES

  37. GIT FUNDAMENTALS
      • each file’s contents (called a blob) are hashed and that hash becomes a key into a dictionary of files
      • each directory is viewed as a sorted list of filenames paired with hash of file’s contents
      • this directory listing can then be hashed and so an entire directory tree can be hashed
      • a tree’s hash + commit message + committer + timestamp + parent(s) can then be hashed
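The four bullets above can be sketched in a few lines of Python 3. This is an illustration only: the serialization is deliberately simplified and is not git's real object format (git prefixes each object with a type and length header, for one), and the function names `blob_hash`, `tree_hash` and `commit_hash` are invented for this example.

```python
import hashlib


def blob_hash(content: bytes) -> str:
    """Hash a file's contents; the digest is its key in the object store."""
    return hashlib.sha1(content).hexdigest()


def tree_hash(entries: dict) -> str:
    """Hash a directory, given a mapping of filename -> content hash."""
    # Sorting the names makes the listing, and so its hash, deterministic.
    listing = "\n".join(f"{name} {entries[name]}" for name in sorted(entries))
    return hashlib.sha1(listing.encode("utf-8")).hexdigest()


def commit_hash(tree, message, committer, timestamp, parents=()):
    """Hash a tree hash together with its commit metadata."""
    record = "\n".join([tree, message, committer, str(timestamp), *parents])
    return hashlib.sha1(record.encode("utf-8")).hexdigest()


# The "dictionary of files": content hash -> content.
objects = {}
for data in (b"hello\n", b"world\n"):
    objects[blob_hash(data)] = data

tree = tree_hash({
    "a.txt": blob_hash(b"hello\n"),
    "b.txt": blob_hash(b"world\n"),
})
root = commit_hash(tree, "initial", "someone@example.com", 0)
child = commit_hash(tree, "no changes", "someone@example.com", 1, parents=[root])
```

Because every hash covers the hashes beneath it, changing one byte in one file changes the blob hash, the tree hash, and every commit hash that descends from it.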
  38. FILES AND DIRECTORIES ARE KINDA LIKE STRINGS AND DICTIONARIES

  39. class NodeBase:
          def __init__(self, repo, content):
              self.repo = repo
              self.content = self.shrink(content)
          def __bytes__(self):
              return ("%s\n%r" % (self.__class__, self.content)
                  ).encode("utf-8")
      class Atom(NodeBase):
          def shrink(self, content):
              return content
          def expand(self):
              return self.content
  40. class List(NodeBase):
          def shrink(self, content):
              return [
                  self.repo.shrink(item)
                  for item in content
              ]
          def expand(self):
              return [
                  self.repo.expand(item)
                  for item in self.content
              ]
  41. import hashlib
      class Repo:
          def __init__(self):
              self.objects = {}
              self.refs = {}
              self.HEAD = "master"
          def store(self, obj):
              sha = hashlib.sha1(bytes(obj)).hexdigest()
              self.objects[sha] = obj
              return sha
  42. def shrink(self, content):
          if isinstance(content, dict):
              return self.store(Dictionary(self, content))
          elif isinstance(content, list):
              return self.store(List(self, content))
          elif isinstance(content, tuple):
              return self.store(Tuple(self, content))
          else:
              return self.store(Atom(self, content))
  43. def expand(self, sha):
          return self.objects[sha].expand()

  44. def create_commit(self, obj_sha, message, parents=None):
          if parents is None:
              parents = []
          return self.store(
              Commit(self, obj_sha, message, parents))
      def commit(self, obj, message):
          obj_sha = self.shrink(obj)
          old_head = self.refs[self.HEAD]
          commit_sha = self.create_commit(
              obj_sha, message, parents=[old_head])
          self.refs[self.HEAD] = commit_sha
          return commit_sha
  45. def create_branch(self, branch_name, commit_sha=None):
          if commit_sha is None:
              commit_sha = self.refs[self.HEAD]
          # point the new branch at the requested commit (HEAD by default)
          self.refs[branch_name] = commit_sha
      def checkout_branch(self, branch_name):
          self.HEAD = branch_name
  46. FUTURE PLANS
      • more object types and representations
      • merging
      • diffing
      • remotes

  47. CZERNY KEYBOARD PERFORMANCE ANALYSIS

  48. THE BASIC IDEA
      • record a performance of the exercise / piece as MIDI (or similar) events
      • align the performed notes with the “score” notes
      • identify errors as well as fluctuations in timing, velocity, etc.
      • basically a performance “diff”
  49. A_notes = [1, 2, 3, 2, 1, 3, 4, 5, 6, 5, 4, 5, 6, 5, 4, 3]
      B_notes = [6, 5, 4, 5, 6, 4, 3, 2, 1, 2, 3, 2, 1, 2, 3, 4]
      C_notes = [6, 5, 4, 5, 6, 4, 3, 2, 1, 2, 3, 2, 1, 2, 3, 2]
      D_notes = [1]
      scale = [0, 2, 4, 5, 7, 9, 11]
      full_scale = scale + [12 + i for i in scale] + [24 + i for i in scale]
      sections = [
          (A_notes, 4, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]),
          (B_notes, 4, [13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]),
          (C_notes, 4, [0]),
          (D_notes, 32, [0])
      ]
      for section in sections:
          pattern, duration_64, offset = section
          for o in offset:
              for note in pattern:
                  # f is an output file opened elsewhere (not shown on the slide)
                  print >>f, 48 + full_scale[note + o - 1], duration_64
  50. SCORE
      48 4
      50 4
      52 4
      50 4
      48 4
      52 4
      ...
  51. #!/usr/bin/env python
      from mac import CoreMIDI
      import time
      start = time.time()
      def callback(event):
          if event[0] == 156:
              print time.time() - start, event[1], event[2]
      CoreMIDI.pyCallback = callback
      while True:
          pass
  52. PERFORMANCE
      4.44376397133 48 73
      4.6487929821 50 59
      4.68273806572 48 0
      4.81475901604 50 0
      4.83673501015 52 66
      5.03977394104 50 69
      5.04273104668 52 0
      5.23374605179 50 0
      5.24569511414 48 70
      5.45274806023 52 63
      5.4536960125 48 0
      5.63474702835 52 0
  53. align with Needleman-Wunsch algorithm
      def note_similarity(score_note, performance_note):
          if score_note[0] == performance_note[1]:
              return 1
          elif abs(score_note[0] - performance_note[1]) < 3:
              return 0.5
          else:
              return 0
      with a -1 for insertions and deletions
      plan to tweak to include velocity, duration, etc
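The slide gives the similarity function and the gap penalty but not the alignment itself. A minimal Python 3 sketch of a textbook Needleman-Wunsch alignment built on them follows. The function name `align`, the output format, and the sample data are assumptions made for illustration; score notes are taken to be (pitch, duration) and performance notes (time, pitch, velocity), as in the SCORE and PERFORMANCE listings, with note-off events already filtered out.

```python
def note_similarity(score_note, performance_note):
    # Same rule as the slide: exact pitch scores 1, a near miss scores 0.5.
    if score_note[0] == performance_note[1]:
        return 1
    elif abs(score_note[0] - performance_note[1]) < 3:
        return 0.5
    else:
        return 0


def align(score, performance, gap=-1):
    """Return (score_index, performance_index) pairs; None marks a gap."""
    rows, cols = len(score) + 1, len(performance) + 1
    # best[i][j] is the best total for aligning score[:i] with performance[:j].
    best = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        best[i][0] = i * gap
    for j in range(1, cols):
        best[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            match = note_similarity(score[i - 1], performance[j - 1])
            best[i][j] = max(
                best[i - 1][j - 1] + match,  # pair the two notes
                best[i - 1][j] + gap,        # score note was not played
                best[i][j - 1] + gap,        # extra note was played
            )
    # Trace back from the bottom-right corner to recover the pairing.
    pairs, i, j = [], rows - 1, cols - 1
    while i > 0 or j > 0:
        if i > 0 and j > 0 and best[i][j] == (
                best[i - 1][j - 1]
                + note_similarity(score[i - 1], performance[j - 1])):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and best[i][j] == best[i - 1][j] + gap:
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    return pairs[::-1]


# Made-up example: the player skips the middle note of 48, 50, 52.
score = [(48, 4), (50, 4), (52, 4)]
performance = [(4.44, 48, 73), (4.83, 52, 66)]
print(align(score, performance))  # [(0, 0), (1, None), (2, 1)]
```

The `(1, None)` entry is the "diff": score note 1 has no counterpart in the performance.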
  54. FUTURE PLANS
      • represent performance as MIDI file
      • represent score as lilypond (and/or Sebastian)
      • generate annotated score showing mistakes
      • with fingering information, highlight deficiencies of particular fingers
      • grade performances
      • study expression
  55. GRADED-READER GENERATION

  56. THE BASIC IDEA
      • overcoming limitations of inductive, corpus-based learning of a language
      • generate graded readers
      • build up from much smaller pieces
      • optimize ordering of vocabulary
      • present in larger context even when context is not yet understood in target language
  57. THE MYTH OF VOCABULARY COVERAGE
      The 10 most common words account for 37% of the text
      The 100 most common words account for 66% of the text
  58. THE MYTH OF VOCABULARY COVERAGE
      if you learn top 100 words, you’ll know...
      one word in 99.9% of verses
      50% of words in 91.3% of verses
      75% of words in 24.4% of verses
      90% of words in 2.1% of verses
      95% of words in 0.6% of verses
      100% of words in 0.4% of verses
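The two kinds of figure above measure different things: coverage of running words versus coverage of whole verses. A small Python 3 sketch of how each could be computed, assuming verses arrive as lists of words and reading "50% of words in 91.3% of verses" as "at least 50% of the words are known". The function names and the toy data are invented; the talk's actual corpus code is not shown.

```python
from collections import Counter


def top_words(verses, n):
    """Return the set of the n most frequent words across all verses."""
    counts = Counter(word for verse in verses for word in verse)
    return {word for word, _ in counts.most_common(n)}


def token_coverage(verses, known):
    """Fraction of all running words that are in `known`."""
    tokens = [word for verse in verses for word in verse]
    return sum(word in known for word in tokens) / len(tokens)


def verse_coverage(verses, known, threshold):
    """Fraction of verses where at least `threshold` of the words are known."""
    readable = sum(
        1 for verse in verses
        if sum(word in known for word in verse) / len(verse) >= threshold
    )
    return readable / len(verses)


# Toy data, made up for illustration only.
verses = [
    ["the", "cat", "sat"],
    ["the", "dog", "ran"],
    ["the", "the", "end"],
    ["a", "dog", "barked"],
]
known = top_words(verses, 2)                 # {"the", "dog"}
print(token_coverage(verses, known))         # 0.5
print(verse_coverage(verses, known, 0.5))    # 0.5
print(verse_coverage(verses, known, 1.0))    # 0.0
```

Even in this tiny example, knowing half of all running words leaves no verse fully readable, which is the gap the slide is pointing at.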
  60. VOCABULARY ORDERING
      • frequency ordering is far from optimal
      • vocabulary ordering as traveling salesman problem
      • simulated annealing
  61. SIMULATED ANNEALING
      def calc_score(self, target_list):
          known_items = set()
          score = 0.0
          num_targets = float(len(target_list))
          for step, target in enumerate(target_list):
              to_learn = target.prereqs - known_items
              score += len(to_learn) * step / num_targets
              known_items.update(to_learn)
          return score
  62. while temp > final_temp:
          for i in range(iterations):
              s1 = self.scorer.calc_score(target_list)
              p1 = randrange(0, num_targets)
              p2 = randrange(0, num_targets)
              new_list = self.swap(target_list, p1, p2)
              s2 = self.scorer.calc_score(new_list)
              if s2 > s1:
                  target_list = new_list
              else:
                  if random() < exp((s2 - s1) / temp):
                      target_list = new_list
          temp = temp * alpha
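The two slides above show fragments of a larger class. To try the idea out, here is a self-contained Python 3 sketch that follows the slides in treating a higher score as better. The parameter defaults, the seeded random generator, the tracking of the best ordering seen, and the sample data are all assumptions added for this example. Prerequisites are passed in as a dictionary, where the slides use a `prereqs` attribute on each target.

```python
import random
from math import exp


def calc_score(target_list, prereqs):
    """Same scoring rule as the slide, with prereqs passed in as a dict."""
    known, score = set(), 0.0
    num_targets = float(len(target_list))
    for step, target in enumerate(target_list):
        to_learn = prereqs[target] - known
        score += len(to_learn) * step / num_targets
        known.update(to_learn)
    return score


def anneal(target_list, prereqs, temp=1.0, final_temp=0.01, alpha=0.9,
           iterations=50, seed=0):
    rng = random.Random(seed)            # seeded so runs are repeatable
    current = list(target_list)
    s1 = calc_score(current, prereqs)
    best, best_score = current, s1
    while temp > final_temp:
        for _ in range(iterations):
            p1 = rng.randrange(len(current))
            p2 = rng.randrange(len(current))
            candidate = list(current)
            candidate[p1], candidate[p2] = candidate[p2], candidate[p1]
            s2 = calc_score(candidate, prereqs)
            # Always accept a higher score; accept a lower one with a
            # probability that shrinks as the temperature drops.
            if s2 > s1 or rng.random() < exp((s2 - s1) / temp):
                current, s1 = candidate, s2
                if s1 > best_score:
                    best, best_score = current, s1
        temp *= alpha                    # cool down after each batch
    return best, best_score


# Hypothetical data: each target maps to the set of items it requires.
prereqs = {
    "s1": {"a", "b", "c"},
    "s2": {"a"},
    "s3": {"a", "b"},
    "s4": {"d"},
}
order, score = anneal(["s1", "s2", "s3", "s4"], prereqs)
print(order, score)
```

Keeping the best ordering seen is a small safeguard: the walk itself is allowed to wander to worse orderings, so its final position is not necessarily the best one it visited.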
  63. # a dictionary mapping targets to a set of items that are
      # needed (and initially missing)
      MISSING_IN_TARGET = defaultdict(set)
      ...
      # for each item, a score of how bad it is that it is missing
      MISSING_ITEMS = defaultdict(int)
      for missing in MISSING_IN_TARGET.values():
          for item in missing:
              MISSING_ITEMS[item] += 1. / (2 ** len(missing))
      if not MISSING_ITEMS:
          break
      next_item = sorted(MISSING_ITEMS, key=MISSING_ITEMS.get)[-1]
      for target in TARGETS_MISSING[next_item]:
          MISSING_IN_TARGET[target].remove(next_item)
  64. OTHER WORK AND FUTURE PLANS
      • consideration of other prerequisite knowledge
      • intrinsic difficulty of a word (cognates, etc)
      • partial dynamic interlinears
      • inline replacement
  65. CLEESE AN OPERATING SYSTEM IN PYTHON

  66. THE BASIC IDEA
      • run the Python VM on bare metal
      • remove parts of CPython that assume an operating system
      • minimize libc
      • add built-ins to directly access memory and I/O ports
      • write drivers, etc in Python
      • even “must be written in assembly” parts use python-like syntax
  67. static PyMethodDef ports_methods[] = {
          {"inb", ports_inb, METH_VARARGS, NULL},
          {"outb", ports_outb, METH_VARARGS, NULL},
          {NULL, NULL},
      };
      PyObject *
      _Ports_Init(void)
      {
          return Py_InitModule4("ports", ports_methods,
                  NULL, (PyObject *)NULL, PYTHON_API_VERSION);
      }
  68. static PyObject *
      ports_outb(PyObject *self, PyObject *args)
      {
          PyObject *v, *a;
          if(!PyArg_UnpackTuple(args, "outb", 2, 2, &v, &a))
                  return NULL;
          if(!PyInt_CheckExact(a))
                  return NULL;
          if(PyInt_CheckExact(v))
                  outb(PyInt_AS_LONG(a),PyInt_AS_LONG(v));
          else if(PyString_Check(v) && PyString_GET_SIZE(v) == 1)
                  outb(PyInt_AS_LONG(a), PyString_AS_STRING(v)[0]);
          else
                  return NULL;
          Py_INCREF(Py_True);
          return Py_True;
      }
  69. SPEAKER
      import ports
      def on(freq):
          if freq:
              ports.outb(freq, 0x42)
              ports.outb(freq >> 8, 0x42)
              ports.outb(ports.inb(0x61) | 0x03, 0x61)
          else:
              off()
      def off():
          ports.outb(ports.inb(0x61) & 0xFC, 0x61)
  70. KEYBOARD
      import ports
      def get_scancode():
          while not (ports.inb(0x64) & 0x21) == 0x01:
              pass
          return ports.inb(0x60)
  71. TWO APPROACHES TO VM
      • start with full CPython and remove bits as you hit problems
      • start with nothing and add just the CPython bits you need
  72. FUTURE PLANS
      • proper memory management
      • less task-specific subsetting of CPython
      • PyPy instead of CPython?
  73. APPLEPY AN APPLE ][ EMULATOR

  74. self.ops[0xA8] = lambda: self.TAY()
      self.ops[0xA9] = lambda: self.LDA(self.immediate_mode())
      self.ops[0xAA] = lambda: self.TAX()
      self.ops[0xAC] = lambda: self.LDY(self.absolute_mode())
      self.ops[0xAD] = lambda: self.LDA(self.absolute_mode())
  75. def immediate_mode(self):
          return self.get_pc()
      def absolute_mode(self):
          self.cycles += 2
          return self.read_pc_word()
  76. def LDA(self, operand_address):
          self.accumulator = self.update_nz(
              self.read_byte(operand_address))

  77. def update_nz(self, value):
          value = value % 0x100
          self.zero_flag = [0, 1][(value == 0)]
          self.sign_flag = [0, 1][((value & 0x80) != 0)]
          return value
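The last four slides show how an opcode table, addressing modes, an instruction, and flag updates fit together. The toy class below wires just those pieces into something runnable in Python 3. It is a sketch, not ApplePy's code: `TinyCPU` and `step` are invented names, the bodies of `get_pc`, `read_byte` and `read_pc_word` are guesses (the slides call them but do not show them), memory is a plain list, and cycle counting is left out.

```python
class TinyCPU:
    """A toy CPU with just enough state to run two forms of LDA."""

    def __init__(self, memory):
        self.memory = memory
        self.pc = 0
        self.accumulator = 0
        self.zero_flag = 0
        self.sign_flag = 0
        # Opcode table: each entry fetches its operand address, then executes.
        self.ops = {
            0xA9: lambda: self.LDA(self.immediate_mode()),
            0xAD: lambda: self.LDA(self.absolute_mode()),
        }

    def get_pc(self):
        """Return the current program counter and advance it by one byte."""
        pc = self.pc
        self.pc += 1
        return pc

    def read_byte(self, address):
        return self.memory[address]

    def read_pc_word(self):
        """Read a 16-bit little-endian word at the program counter."""
        lo = self.read_byte(self.get_pc())
        hi = self.read_byte(self.get_pc())
        return lo | (hi << 8)

    def immediate_mode(self):
        return self.get_pc()          # the operand is the next byte itself

    def absolute_mode(self):
        return self.read_pc_word()    # the operand lives at a 16-bit address

    def update_nz(self, value):
        value &= 0xFF
        self.zero_flag = int(value == 0)
        self.sign_flag = int((value & 0x80) != 0)
        return value

    def LDA(self, operand_address):
        self.accumulator = self.update_nz(self.read_byte(operand_address))

    def step(self):
        """Fetch one opcode and dispatch it through the table."""
        opcode = self.read_byte(self.get_pc())
        self.ops[opcode]()


# Program: LDA #$80, then LDA $0006 (address 6 holds 0).
cpu = TinyCPU([0xA9, 0x80, 0xAD, 0x06, 0x00, 0x00, 0x00])
cpu.step()   # accumulator = 0x80, sign flag set
cpu.step()   # accumulator = 0x00, zero flag set
```

Keeping addressing modes separate from instructions, as the slides do, means each opcode entry is just a pairing of one with the other.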
  78. class SoftSwitches:
          ...
          def read_byte(self, cycle, address):
              assert 0xC000 <= address <= 0xCFFF
              if address == 0xC000:
                  return self.kbd
              elif address == 0xC010:
                  self.kbd = self.kbd & 0x7F
              elif address == 0xC030:
                  if self.speaker:
                      self.speaker.toggle(cycle)
  79. THE REAL CHALLENGES
      • character generator
      • cassette interface
      • speaker
      • graphics
      • (disk)

  80. OTHER THINGS
      • Forth Interpreter
      • Pascal Interpreter (as a prolegomenon to even more insanity)
      • Relational Algebra Implementation
      • Quantum Computing
      • Unicode Collation Algorithm Implementation
      • Global Illumination Renderer
      • Hacking Data Files from Lord of the Rings Online and Skyrim
  81. jtauber.github.com jtauber.com @jtauber