You Used Python For WHAT?!

A talk I gave on various things I've attempted to do in Python over the years. Given 2012-03-22 at the Boston Python Meetup.

James Tauber

March 22, 2012

Transcript

  1. YOU USED PYTHON
    FOR WHAT?!
    James Tauber

  2. YOU USED PYTHON
    FOR WHAT?!
    James Tauber
    ARE USING

  5. Quisition
    ?

  6. habitualist

  7. YOU USED PYTHON
    FOR WHAT?!

  8. jtauber.github.com
    jtauber.com
    @jtauber

  9. THE TOPICS
    •Some Mathematics
    •Some Weird Programming Languages
    •Analyzing Keyboard Performances
    •Generating Ancient Greek Graded Readers
    •Emulating the Apple ][
    •Writing an Operating System

  10. PRIMARY GOAL
    YOU FIND AT LEAST ONE OF THESE
    INTERESTING

  11. SECONDARY GOAL
    YOU QUESTION MY SANITY

  12. I DON’T REALLY UNDERSTAND
    SOMETHING UNTIL I’VE
    IMPLEMENTED IT IN PYTHON

  13. BRAINF***

  14. EIGHT COMMANDS
    > increment the data pointer
    < decrement the data pointer
    + increment the byte at the data pointer
    - decrement the byte at the data pointer
    . output the byte at the data pointer as an ASCII-encoded character
    , accept one byte of input, storing its value in the byte at the data pointer
    [ if the byte at the data pointer is zero, jump forward after matching ]
    ] if the byte at the data pointer is nonzero, jump back after matching [

  15. >+++++++++[<++++++++>-]<.
    >+++++++[<++++>-]<+.
    +++++++.
    .
    +++.
    [-]>++++++++[<++++>-]<.
    >+++++++++++[<++++++++>-]<-.
    --------.
    +++.
    ------.
    --------.
    [-]>++++++++[<++++>-]<+.
    [-]++++++++++.

  16. import sys
    class brainf:
        def __init__(self, program):
            self.mem = [0] * 65536
            self.p = 0
            self.pc = 0
            self.program = program

  17. def run(self, stop=None):
        if stop is None:
            stop = len(self.program)
        while self.pc < stop:
            c = self.program[self.pc]
            if c == ">": self.p += 1
            elif c == "<": self.p -= 1
            elif c == "+": self.mem[self.p] += 1
            elif c == "-": self.mem[self.p] -= 1
            elif c == ".": sys.stdout.write(chr(self.mem[self.p]))
            elif c == ",": self.mem[self.p] = ord(sys.stdin.read(1))
            elif c == "[":
                ...

  18. elif c == "[":
        depth = 1
        start = end = self.pc
        # scan forward to the matching ]
        while depth:
            end += 1
            if self.program[end] == "[":
                depth += 1
            if self.program[end] == "]":
                depth -= 1
        # run the loop body until the current cell is zero
        while self.mem[self.p]:
            self.pc = start + 1
            self.run(end)
        self.pc = end
    elif c == "]":
        raise SyntaxError("unbalanced ]")
    else:
        pass
    self.pc += 1
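
Slides 16 to 18 can be pulled together into one runnable piece. The following is a minimal Python 3 sketch, not the code from the talk: it assumes a precomputed bracket table in place of the recursive run(end) call, wraps cells to 8 bits, treats end of input as 0, and returns the output as a string. The names run_bf and match_brackets are hypothetical.

```python
def match_brackets(program):
    """Map the index of every '[' to its matching ']' and vice versa."""
    stack, pairs = [], {}
    for i, c in enumerate(program):
        if c == "[":
            stack.append(i)
        elif c == "]":
            if not stack:
                raise SyntaxError("unbalanced ]")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    if stack:
        raise SyntaxError("unbalanced [")
    return pairs


def run_bf(program, input_text=""):
    """Run a program and return everything it wrote, as a string."""
    pairs = match_brackets(program)
    mem = [0] * 65536
    p = pc = 0
    inp = iter(input_text)
    out = []
    while pc < len(program):
        c = program[pc]
        if c == ">":
            p += 1
        elif c == "<":
            p -= 1
        elif c == "+":
            mem[p] = (mem[p] + 1) % 256
        elif c == "-":
            mem[p] = (mem[p] - 1) % 256
        elif c == ".":
            out.append(chr(mem[p]))
        elif c == ",":
            mem[p] = ord(next(inp, "\0"))  # assumption: end of input reads as 0
        elif c == "[" and mem[p] == 0:
            pc = pairs[pc]  # skip the loop body
        elif c == "]" and mem[p] != 0:
            pc = pairs[pc]  # jump back to the loop start
        pc += 1
    return "".join(out)


# 9 * 8 = 72, the ASCII code for "H" (same shape as the first line of slide 15)
LETTER_H = ">" + "+" * 9 + "[<" + "+" * 8 + ">-]<."
# the single-digit adder from slide 19
ADD_DIGITS = ",>" + "+" * 6 + "[<" + "-" * 8 + ">-],[<+>-],<.>."

print(run_bf(LETTER_H))                   # H
print(repr(run_bf(ADD_DIGITS, "23\n")))   # '5\n'
```

The bracket table trades a little setup time for a flat loop, which avoids re-scanning for the matching bracket on every iteration.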

  19. ,>++++++[<-------->-],[<+>-],<.>.

  20. GIBBONS-LESTER-BIRD
    ALGORITHM FOR ENUMERATING THE
    POSITIVE RATIONALS

  21. def reciprocal((n, d)):
        return (d, n)
    def one_take((n, d)):
        return (d - n, d)
    def proper_fraction((n, d)):
        return (n // d, (n % d, d))
    def improper_fraction(i, (n, d)):
        return ((d * i) + n, d)

  22. def rationals():
        r = (0,1)
        while True:
            n, y = proper_fraction(r)
            z = improper_fraction(n, one_take(y))
            r = reciprocal(z)
            yield r

  23. def rationals():
        r = (0,1)
        while True:
            r = (r[1], (r[1] * (r[0] // r[1])) +
                (r[1] - (r[0] % r[1])))
            yield r

  24. def rationals(r=(0,1)):
        while True:
            r = (r[1], (r[1] * (r[0] // r[1])) +
                (r[1] - (r[0] % r[1])))
            yield r
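
The helper functions on slide 21 use tuple parameters such as (n, d) in the signature, which Python 3 no longer accepts. As a hedged sketch, here is the collapsed generator from slides 23 and 24 written for Python 3, with the pair unpacked into two names; the arithmetic is unchanged.

```python
from itertools import islice
from math import gcd


def rationals(r=(0, 1)):
    """Yield positive rationals as (numerator, denominator) pairs."""
    n, d = r
    while True:
        # reciprocal of: floor(n/d) + (1 - fractional part of n/d)
        n, d = d, d * (n // d) + (d - n % d)
        yield (n, d)


first = list(islice(rationals(), 8))
print(first)
# [(1, 1), (1, 2), (2, 1), (1, 3), (3, 2), (2, 3), (3, 1), (1, 4)]
```

Every pair comes out already in lowest terms, so no gcd reduction step is needed inside the loop.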

  25. PLOUFFE FORMULA
    FOR PI IN HEX

  26. def pi():
        N = 0
        n, d = 0, 1
        while True:
            xn = (120*N**2 + 151*N + 47)
            xd = (512*N**4 + 1024*N**3 + 712*N**2 + 194*N + 15)
            n = ((16 * n * xd) + (xn * d)) % (d * xd)
            d *= xd
            yield 16 * n // d
            N += 1
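
A short usage sketch for the generator on slide 26, in Python 3. The generator body is repeated so the block runs on its own; only the last two lines are new. The fraction xn/xd looks like the four terms 4/(8N+1) - 2/(8N+4) - 1/(8N+5) - 1/(8N+6) combined over one denominator (for N = 0 both give 47/15), and the generator keeps only the fractional part at each step, so each yielded value is one hex digit after the point.

```python
from itertools import islice


def pi():
    """Yield successive hex digits of pi after the point (slide 26)."""
    N = 0
    n, d = 0, 1
    while True:
        xn = 120*N**2 + 151*N + 47
        xd = 512*N**4 + 1024*N**3 + 712*N**2 + 194*N + 15
        # n/d becomes the fractional part of 16 * (old n/d) + xn/xd
        n = ((16 * n * xd) + (xn * d)) % (d * xd)
        d *= xd
        yield 16 * n // d
        N += 1


digits = "".join("%x" % digit for digit in islice(pi(), 8))
print("3." + digits)  # 3.243f6a88
```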

  27. EVEN WHEN I’VE IMPLEMENTED IT
    IN PYTHON, THAT DOESN’T MEAN I
    UNDERSTAND IT

  28. CHURCH ENCODING

  29. TRUE = lambda a: lambda b: (a)
    FALSE = lambda a: lambda b: (b)
    (TRUE)(True)(False) == True
    (FALSE)(True)(False) == False

  30. AND = lambda a: lambda b: (a)(b)(a)
    OR = lambda a: lambda b: (a)(a)(b)
    NOT = lambda a: lambda b: lambda c: (a)(c)(b)
    (AND)(TRUE)(FALSE) == (FALSE)
    (AND)(TRUE)(FALSE)(True)(False) == False

  31. CONS = lambda a: lambda b: lambda c: (c)(a)(b)
    CAR = lambda a: (a)(TRUE)
    CDR = lambda a: (a)(FALSE)
    (CAR)((CONS)(1)(2)) == 1
    (CDR)((CONS)(1)(2)) == 2

  32. UNCHURCH_BOOLEAN = (CONS)(True)(False)
    (UNCHURCH_BOOLEAN)((NOT)(TRUE)) == False
    (UNCHURCH_BOOLEAN)((OR)(TRUE)(FALSE)) == True

  33. ZERO = FALSE
    SUCC = lambda a: lambda b: lambda c: (b)((a)(b)(c))
    ONE = (SUCC)(ZERO)
    TWO = (SUCC)(ONE)
    THREE = (SUCC)(TWO)
    FOUR = (SUCC)(THREE)
    def church_number(n):
        return SUCC(church_number(n - 1)) if n else FALSE

  34. PLUS = (lambda a: lambda b: lambda c: lambda d:
        (a)(c)((b)(c)(d)))
    MULT = lambda a: lambda b: lambda c: (b)((a)(c))
    EXP = lambda a: lambda b: (b)(a)
    UNCHURCH_NUMBER = lambda a: (a)(lambda b: b + 1)(0)

  35. (UNCHURCH_NUMBER)(ZERO) == 0
    (UNCHURCH_NUMBER)(ONE) == 1
    (UNCHURCH_NUMBER)(TWO) == 2
    (UNCHURCH_NUMBER)((PLUS)(THREE)(TWO)) == 5
    (UNCHURCH_NUMBER)((MULT)(THREE)(TWO)) == 6
    (UNCHURCH_NUMBER)((EXP)(THREE)(TWO)) == 9

  36. GIT-STYLE VERSIONING OF
    PYTHON DATA STRUCTURES

  37. GIT FUNDAMENTALS
    • each file’s contents (called a blob) are hashed and that hash becomes a key into a dictionary of files
    • each directory is viewed as a sorted list of filenames, each paired with the hash of that file’s contents
    • this directory listing can then be hashed and so an entire directory tree can be hashed
    • a tree’s hash + commit message + committer + timestamp + parent(s) can then be hashed
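
The four bullets above can be sketched with hashlib. This is a simplified illustration under stated assumptions, not Git's actual object format (real Git adds type and size headers and a binary tree encoding). The names hash_blob, hash_tree and hash_commit, the text layout of each record, and the committer "jt" are all hypothetical.

```python
import hashlib

objects = {}  # hash -> stored bytes, the "dictionary of files"


def _store(data):
    key = hashlib.sha1(data).hexdigest()
    objects[key] = data
    return key


def hash_blob(contents):
    """A file's contents are hashed; the hash is the lookup key."""
    return _store(contents)


def hash_tree(entries):
    """entries maps a name to the hash of a blob or of another tree."""
    listing = "\n".join(
        "%s %s" % (name, entries[name]) for name in sorted(entries)
    )
    return _store(listing.encode("utf-8"))


def hash_commit(tree, message, committer, timestamp, parents=()):
    """Tree hash + message + committer + timestamp + parent hashes."""
    record = "\n".join(
        [tree, message, committer, str(timestamp)] + list(parents)
    )
    return _store(record.encode("utf-8"))


readme = hash_blob(b"hello\n")
docs = hash_tree({"README": readme})
root = hash_tree({"docs": docs, "LICENSE": hash_blob(b"MIT\n")})
first = hash_commit(root, "initial", "jt", 0)
second = hash_commit(root, "no changes", "jt", 1, parents=[first])
print(first != second)  # True: same tree, different commit record
```

Because a tree stores only hashes, changing one file changes its blob hash, then every tree hash above it, then the commit hash, while unchanged files keep their existing entries.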

  38. FILES AND DIRECTORIES
    ARE KINDA LIKE
    STRINGS AND DICTIONARIES

  39. class NodeBase:
        def __init__(self, repo, content):
            self.repo = repo
            self.content = self.shrink(content)
        def __bytes__(self):
            return ("%s\n%r" %
                (self.__class__, self.content)
            ).encode("utf-8")
    class Atom(NodeBase):
        def shrink(self, content):
            return content
        def expand(self):
            return self.content

  40. class List(NodeBase):
        def shrink(self, content):
            return [
                self.repo.shrink(item) for item in content
            ]
        def expand(self):
            return [
                self.repo.expand(item) for item in self.content
            ]

  41. import hashlib
    class Repo:
        def __init__(self):
            self.objects = {}
            self.refs = {}
            self.HEAD = "master"
        def store(self, obj):
            sha = hashlib.sha1(bytes(obj)).hexdigest()
            self.objects[sha] = obj
            return sha

  42. def shrink(self, content):
        if isinstance(content, dict):
            return self.store(Dictionary(self, content))
        elif isinstance(content, list):
            return self.store(List(self, content))
        elif isinstance(content, tuple):
            return self.store(Tuple(self, content))
        else:
            return self.store(Atom(self, content))

  43. def expand(self, sha):
        return self.objects[sha].expand()

  44. def create_commit(self, obj_sha, message,
            parents=None):
        if parents is None:
            parents = []
        return self.store(
            Commit(self, obj_sha, message, parents))
    def commit(self, obj, message):
        obj_sha = self.shrink(obj)
        old_head = self.refs[self.HEAD]
        commit_sha = self.create_commit(
            obj_sha, message, parents=[old_head])
        self.refs[self.HEAD] = commit_sha
        return commit_sha

  45. def create_branch(self, branch_name, commit_sha=None):
        if commit_sha is None:
            commit_sha = self.refs[self.HEAD]
        # point the new branch at the requested commit, not always at HEAD
        self.refs[branch_name] = commit_sha
    def checkout_branch(self, branch_name):
        self.HEAD = branch_name

  46. FUTURE PLANS
    •more object types and representations
    •merging
    •diffing
    •remotes

  47. CZERNY
    KEYBOARD PERFORMANCE ANALYSIS

  48. THE BASIC IDEA
    •record a performance of the exercise / piece as MIDI (or similar) events
    •align the performed notes with the “score” notes
    •identify errors as well as fluctuations in timing, velocity, etc.
    •basically a performance “diff”

  49. A_notes = [1, 2, 3, 2, 1, 3, 4, 5, 6, 5, 4, 5, 6, 5, 4, 3]
    B_notes = [6, 5, 4, 5, 6, 4, 3, 2, 1, 2, 3, 2, 1, 2, 3, 4]
    C_notes = [6, 5, 4, 5, 6, 4, 3, 2, 1, 2, 3, 2, 1, 2, 3, 2]
    D_notes = [1]
    scale = [0, 2, 4, 5, 7, 9, 11]
    full_scale = scale + [12 + i for i in scale] + [24 + i for i in scale]
    sections = [
        (A_notes, 4, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]),
        (B_notes, 4, [13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]),
        (C_notes, 4, [0]),
        (D_notes, 32, [0])
    ]
    for section in sections:
        pattern, duration_64, offset = section
        for o in offset:
            for note in pattern:
                print >>f, 48 + full_scale[note + o - 1], duration_64

  50. 48 4
    50 4
    52 4
    50 4
    48 4
    52 4
    ...
    SCORE

  51. #!/usr/bin/env python
    from mac import CoreMIDI
    import time
    start = time.time()
    def callback(event):
        if event[0] == 156:
            print time.time() - start, event[1], event[2]
    CoreMIDI.pyCallback = callback
    while True:
        pass

  52. 4.44376397133 48 73
    4.6487929821 50 59
    4.68273806572 48 0
    4.81475901604 50 0
    4.83673501015 52 66
    5.03977394104 50 69
    5.04273104668 52 0
    5.23374605179 50 0
    5.24569511414 48 70
    5.45274806023 52 63
    5.4536960125 48 0
    5.63474702835 52 0
    PERFORMANCE

  53. align with Needleman-Wunsch algorithm
    def note_similarity(score_note, performance_note):
        if score_note[0] == performance_note[1]:
            return 1
        elif abs(score_note[0] - performance_note[1]) < 3:
            return 0.5
        else:
            return 0
    with a -1 for insertions and deletions
    plan to tweak to include velocity, duration, etc
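
To show how the similarity function and the -1 gap penalty fit together, here is a minimal Needleman-Wunsch sketch in Python 3. It is an illustration, not the talk's implementation. It assumes score notes are (pitch, duration) pairs as on slide 50 and performance notes are (time, pitch, velocity) triples as on slide 52 with the velocity-0 events already filtered out; align and GAP are hypothetical names.

```python
GAP = -1  # penalty for an inserted or deleted note


def note_similarity(score_note, performance_note):
    # same indexing as slide 53: pitch is score_note[0] and performance_note[1]
    if score_note[0] == performance_note[1]:
        return 1
    elif abs(score_note[0] - performance_note[1]) < 3:
        return 0.5
    else:
        return 0


def align(score, performance):
    """Return (total, pairs); None in a pair marks a missing or extra note."""
    rows, cols = len(score) + 1, len(performance) + 1
    # best[i][j]: best total for score[:i] against performance[:j]
    best = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        best[i][0] = i * GAP
    for j in range(1, cols):
        best[0][j] = j * GAP
    for i in range(1, rows):
        for j in range(1, cols):
            match = note_similarity(score[i - 1], performance[j - 1])
            best[i][j] = max(best[i - 1][j - 1] + match,
                             best[i - 1][j] + GAP,
                             best[i][j - 1] + GAP)
    # trace back from the bottom-right corner to recover the pairing
    pairs = []
    i, j = len(score), len(performance)
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and best[i][j] == best[i - 1][j - 1]
                + note_similarity(score[i - 1], performance[j - 1])):
            pairs.append((score[i - 1], performance[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and best[i][j] == best[i - 1][j] + GAP:
            pairs.append((score[i - 1], None))  # written but not played
            i -= 1
        else:
            pairs.append((None, performance[j - 1]))  # played but not written
            j -= 1
    pairs.reverse()
    return best[-1][-1], pairs


score = [(48, 4), (50, 4), (52, 4), (50, 4)]
performance = [(4.44, 48, 73), (4.65, 50, 59), (5.04, 50, 69)]  # the 52 was skipped
total, pairs = align(score, performance)
print(total)     # 2
print(pairs[2])  # ((52, 4), None)
```

The list of pairs is the performance "diff": a None on the right is a dropped note, a None on the left is an extra one, and matched pairs can then be compared on timing and velocity.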

  54. FUTURE PLANS
    •represent performance as MIDI file
    •represent score as lilypond (and/or Sebastian)
    •generate annotated score showing mistakes
    •with fingering information, highlight deficiencies of particular fingers
    •grade performances
    •study expression

  55. GRADED-READER GENERATION

  56. THE BASIC IDEA
    •overcoming limitations of inductive, corpus-based learning of a language
    •generate graded readers
    •build up from much smaller pieces
    •optimize ordering of vocabulary
    •present in larger context even when context is not yet understood in
    target language

  57. THE MYTH OF VOCABULARY COVERAGE
    The 10 most common words
    account for 37% of the text
    The 100 most common words
    account for 66% of the text

  58. THE MYTH OF VOCABULARY COVERAGE
    if you learn top 100 words, you’ll know...
    one word in 99.9% of verses
    50% of words in 91.3% of verses
    75% of words in 24.4% of verses
    90% of words in 2.1% of verses
    95% of words in 0.6% of verses
    100% of words in 0.4% of verses
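
The two kinds of figure on slides 57 and 58 measure different things: share of all word tokens covered, versus share of verses in which enough words are known. A hedged sketch of both calculations follows, using a toy English corpus rather than the text behind the slides; token_coverage and verse_coverage are hypothetical names.

```python
from collections import Counter


def token_coverage(verses, top_n):
    """Return the top_n most frequent words and the share of tokens they cover."""
    counts = Counter(word for verse in verses for word in verse)
    known = {word for word, _ in counts.most_common(top_n)}
    covered = sum(c for word, c in counts.items() if word in known)
    return known, covered / sum(counts.values())


def verse_coverage(verses, known, threshold):
    """Share of verses in which at least `threshold` of the words are known."""
    hits = sum(
        1 for verse in verses
        if sum(word in known for word in verse) / len(verse) >= threshold
    )
    return hits / len(verses)


verses = [
    "the cat sat on the mat".split(),
    "the dog sat".split(),
    "a bird sang".split(),
    "the end".split(),
]
known, tokens = token_coverage(verses, top_n=2)
print(sorted(known))                        # ['sat', 'the']
print(round(tokens, 3))                     # 0.429
print(verse_coverage(verses, known, 0.5))   # 0.75
print(verse_coverage(verses, known, 1.0))   # 0.0
```

Even in this toy corpus the pattern from the slides shows up: two words cover over 40% of the tokens, yet no verse is fully readable.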

  60. VOCABULARY ORDERING
    •frequency ordering is far from optimal
    •vocabulary ordering as traveling salesman problem
    •simulated annealing

  61. def calc_score(self, target_list):
        known_items = set()
        score = 0.0
        num_targets = float(len(target_list))
        for step, target in enumerate(target_list):
            to_learn = target.prereqs - known_items
            score += len(to_learn) * step / num_targets
            known_items.update(to_learn)
        return score
    SIMULATED ANNEALING

  62. while temp > final_temp:
        for i in range(iterations):
            s1 = self.scorer.calc_score(target_list)
            p1 = randrange(0, num_targets)
            p2 = randrange(0, num_targets)
            new_list = self.swap(target_list, p1, p2)
            s2 = self.scorer.calc_score(new_list)
            if s2 > s1:
                target_list = new_list
            else:
                if random() < exp((s2 - s1) / temp):
                    target_list = new_list
        temp = temp * alpha
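
The loop on slide 62 depends on names defined elsewhere (temp, alpha, swap, the scorer). The following Python 3 sketch fills those in so the whole thing runs. Everything outside calc_score and the acceptance rule is an assumption: the Target tuple, the default temperatures, the cooling factor, and the seeded random generator are placeholders, not values from the talk.

```python
import random
from collections import namedtuple
from math import exp

# hypothetical stand-in for whatever object carries a .prereqs set
Target = namedtuple("Target", ["name", "prereqs"])


def calc_score(target_list):
    # as on slide 61, minus the self parameter
    known_items = set()
    score = 0.0
    num_targets = float(len(target_list))
    for step, target in enumerate(target_list):
        to_learn = target.prereqs - known_items
        score += len(to_learn) * step / num_targets
        known_items.update(to_learn)
    return score


def swap(items, p1, p2):
    """Return a copy of items with positions p1 and p2 exchanged."""
    new_items = list(items)
    new_items[p1], new_items[p2] = new_items[p2], new_items[p1]
    return new_items


def anneal(target_list, temp=10.0, final_temp=0.01, alpha=0.9,
           iterations=50, rng=None):
    rng = rng or random.Random(0)  # seeded so runs are repeatable
    target_list = list(target_list)
    num_targets = len(target_list)
    while temp > final_temp:
        for _ in range(iterations):
            s1 = calc_score(target_list)
            new_list = swap(target_list,
                            rng.randrange(num_targets),
                            rng.randrange(num_targets))
            s2 = calc_score(new_list)
            # slide 62's rule: take a higher score outright, otherwise
            # accept with a probability that shrinks as temp drops
            if s2 > s1 or rng.random() < exp((s2 - s1) / temp):
                target_list = new_list
        temp *= alpha
    return target_list


targets = [
    Target("a", frozenset({"x"})),
    Target("b", frozenset({"x", "y"})),
    Target("c", frozenset({"y", "z"})),
    Target("d", frozenset({"w"})),
]
ordered = anneal(targets)
print([t.name for t in ordered], calc_score(ordered))
```

Early on, with a high temperature, almost any swap is accepted; as temp falls the loop behaves more and more like plain hill climbing.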

  63. # a dictionary mapping targets to a set of items that are
    # needed (and initially missing)
    MISSING_IN_TARGET = defaultdict(set)
    ...
    # for each item, a score of how bad it is that it is missing
    MISSING_ITEMS = defaultdict(int)
    for missing in MISSING_IN_TARGET.values():
        for item in missing:
            MISSING_ITEMS[item] += 1. / (2 ** len(missing))
    if not MISSING_ITEMS:
        break  # leaves the enclosing loop, which the slide does not show
    next_item = sorted(MISSING_ITEMS, key=MISSING_ITEMS.get)[-1]
    for target in TARGETS_MISSING[next_item]:
        MISSING_IN_TARGET[target].remove(next_item)

  64. OTHER WORK AND FUTURE PLANS
    •consideration of other prerequisite knowledge
    •intrinsic difficulty of a word (cognates, etc)
    •partial dynamic interlinears
    •inline replacement

  65. CLEESE
    AN OPERATING SYSTEM IN PYTHON

  66. THE BASIC IDEA
    •run the Python VM on bare metal
    •remove parts of CPython that assume an operating system
    •minimize libc
    •add built-ins to directly access memory and I/O ports
    •write drivers, etc in Python
    •even “must be written in assembly” parts use Python-like syntax

  67. static PyMethodDef ports_methods[] = {
        {"inb", ports_inb, METH_VARARGS, NULL},
        {"outb", ports_outb, METH_VARARGS, NULL},
        {NULL, NULL},
    };
    PyObject *
    _Ports_Init(void)
    {
        return Py_InitModule4("ports", ports_methods,
            NULL, (PyObject *)NULL, PYTHON_API_VERSION);
    }

  68. static PyObject *
    ports_outb(PyObject *self, PyObject *args)
    {
        PyObject *v, *a;
        if(!PyArg_UnpackTuple(args, "outb", 2, 2, &v, &a))
            return NULL;
        if(!PyInt_CheckExact(a))
            return NULL;
        if(PyInt_CheckExact(v))
            outb(PyInt_AS_LONG(a),PyInt_AS_LONG(v));
        else if(PyString_Check(v) && PyString_GET_SIZE(v) == 1)
            outb(PyInt_AS_LONG(a), PyString_AS_STRING(v)[0]);
        else
            return NULL;
        Py_INCREF(Py_True);
        return Py_True;
    }

  69. import ports
    def on(freq):
        if freq:
            ports.outb(freq, 0x42)
            ports.outb(freq >> 8, 0x42)
            ports.outb(ports.inb(0x61) | 0x03, 0x61)
        else:
            off()
    def off():
        ports.outb(ports.inb(0x61) & 0xFC, 0x61)
    SPEAKER

  70. import ports
    def get_scancode():
        while not (ports.inb(0x64) & 0x21) == 0x01:
            pass
        return ports.inb(0x60)
    KEYBOARD

  71. TWO APPROACHES TO VM
    •start with full CPython and remove bits as you hit problems
    •start with nothing and add just the CPython bits you need

  72. FUTURE PLANS
    •proper memory management
    •less task-specific subsetting of CPython
    •PyPy instead of CPython?

  73. APPLEPY
    AN APPLE ][ EMULATOR

  74. self.ops[0xA8] = lambda: self.TAY()
    self.ops[0xA9] = lambda: self.LDA(self.immediate_mode())
    self.ops[0xAA] = lambda: self.TAX()
    self.ops[0xAC] = lambda: self.LDY(self.absolute_mode())
    self.ops[0xAD] = lambda: self.LDA(self.absolute_mode())

  75. def immediate_mode(self):
        return self.get_pc()
    def absolute_mode(self):
        self.cycles += 2
        return self.read_pc_word()

  76. def LDA(self, operand_address):
        self.accumulator = self.update_nz(
            self.read_byte(operand_address))

  77. def update_nz(self, value):
        value = value % 0x100
        self.zero_flag = [0, 1][(value == 0)]
        self.sign_flag = [0, 1][((value & 0x80) != 0)]
        return value
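
Slides 74 to 77 show the pieces of the opcode dispatch; the sketch below wires them into something that steps through three instructions. It is a Python 3 illustration, not ApplePy itself. The bodies of get_pc, read_byte and read_pc_word are assumptions about what those helpers do (advance the program counter, read memory, read a little-endian word), cycle counting is left out, and MiniCPU, step and x_index are hypothetical names.

```python
class MiniCPU:
    def __init__(self, memory):
        self.memory = memory
        self.program_counter = 0
        self.accumulator = 0
        self.x_index = 0
        self.zero_flag = 0
        self.sign_flag = 0
        # opcode -> handler, in the style of slide 74
        self.ops = {
            0xA9: lambda: self.LDA(self.immediate_mode()),
            0xAA: lambda: self.TAX(),
            0xAD: lambda: self.LDA(self.absolute_mode()),
        }

    def get_pc(self):
        """Return the current program counter, then advance it by one."""
        pc = self.program_counter
        self.program_counter += 1
        return pc

    def read_byte(self, address):
        return self.memory[address]

    def read_pc_word(self):
        """Read a little-endian 16-bit word at the program counter."""
        lo = self.read_byte(self.get_pc())
        hi = self.read_byte(self.get_pc())
        return lo | (hi << 8)

    def immediate_mode(self):
        return self.get_pc()  # the operand is the byte right after the opcode

    def absolute_mode(self):
        return self.read_pc_word()  # the operand's address follows the opcode

    def update_nz(self, value):
        value %= 0x100
        self.zero_flag = int(value == 0)
        self.sign_flag = int((value & 0x80) != 0)
        return value

    def LDA(self, operand_address):
        self.accumulator = self.update_nz(self.read_byte(operand_address))

    def TAX(self):
        self.x_index = self.update_nz(self.accumulator)

    def step(self):
        opcode = self.read_byte(self.get_pc())
        self.ops[opcode]()


memory = bytearray(0x100)
# LDA #$80 ; TAX ; LDA $0010
memory[0:6] = bytes([0xA9, 0x80, 0xAA, 0xAD, 0x10, 0x00])
cpu = MiniCPU(memory)
for _ in range(3):
    cpu.step()
print(hex(cpu.x_index), cpu.accumulator, cpu.zero_flag, cpu.sign_flag)
# 0x80 0 1 0
```

Having every addressing mode return an address, including immediate mode, is what lets one LDA method serve both opcodes.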

  78. class SoftSwitches:
        ...
        def read_byte(self, cycle, address):
            assert 0xC000 <= address <= 0xCFFF
            if address == 0xC000:
                return self.kbd
            elif address == 0xC010:
                self.kbd = self.kbd & 0x7F
            elif address == 0xC030:
                if self.speaker:
                    self.speaker.toggle(cycle)

  79. THE REAL CHALLENGES
    •character generator
    •cassette interface
    •speaker
    •graphics
    •(disk)

  80. OTHER THINGS
    •Forth Interpreter
    •Pascal Interpreter
    (as a prolegomenon to even more insanity)
    •Relational Algebra Implementation
    •Quantum Computing
    •Unicode Collation Algorithm Implementation
    •Global Illumination Renderer
    •Hacking Data Files from Lord of the Rings Online and Skyrim

  81. jtauber.github.com
    jtauber.com
    @jtauber
