Discovering Python

Presentation. PyCon 2014. Montreal. Conference video at https://www.youtube.com/watch?v=RZ4Sn-Y7AP8

David Beazley

April 11, 2014

Transcript

  1. Discovering Python
    David Beazley (@dabeaz)
    http://www.dabeaz.com
    PyCon'2014 Montreal

  2. In 2005...
    ... I was hired to go look at 1.5 TB
    (yes, that's Terabytes) of source
    code sitting in a secret vault.

  3. Six Years Later...
    I testified in US district court about:
    - Concurrency
    - Threads
    - Event loops
    - Interrupts
    Good god!

  4. Discovering with Python
    (or what happens when Python is brought
    into the ring of a legal battle)

  5. Disclaimer
    Everything in this talk actually happened
    Names and details have been changed
    Non-disclosure (I'd have to kill you)
    All exhibits/photos are fictional
    I know nothing, you'll learn nothing

  6. Meet Alice

  7. Alice
    Meet Bob

  8. Alice
    Bob
    "No, I'll send YOU a message!"

  9. Alice
    Bob
    Bob's Attorney

  10. Alice
    Bob
    Bob's Attorney
    "Bwhahahaha!"
    Patent
    Infringement

  11. Alice
    Bob
    Bob's Attorney
    "Prepare to die!"
    Alice's Attorney

  12. Alice
    Bob
    Bob's Attorney
    "Prepare to die!"
    Alice's Attorney
    "Bring it!"

  13. Let's Talk Patents
    A hot-button issue
    Myth: All patent lawsuits are trolls
    Myth: All patent lawsuits involving
    software are purely about software
    Fact: Patent litigation is hell

  14. Patent Litigation Timeline
    You hear about patents a lot
    But what actually happens?
    This talk is about that!
    Initial Complaint
    Fact Discovery (9-12 months)
    Claim Construction
    Summary Judgement
    Trial

  15. Fact Discovery
    Bob
    "Obvious
    Infringement"

  16. Fact Discovery
    Alice
    "Obviously Different"

  17. Fact Discovery
    Bob's Attorney Alice's Attorney
    Facts

  18. "Just the facts, ma'am"
    Enter: Fact Expert
    Technical expert
    Unbiased party
    Privileged
    Works with legal

  19. Reality
    Bob's Attorney
    Bob's Coworkers
    Fact Expert

  20. The Team
    Bob's Attorney
    Bob's Coworkers
    Fact Expert
    Me

  21. What Happens
    You are dropped into a firestorm
    No technical guidance
    Because no one knows anything... that's why they called you!

  22. Quick Learning
    The Invention

  23. Quick Learning
    The Invention
    7. The system of claim 5 or 6, wherein the display and
    input means comprises displays means and input means,
    the input means being connected to the central processing
    unit, the display means being connected to the slithering
    means and the central processing unit, the display means
    being arranged to display the displays and the input means
    transferring the input responses to the central processing
    unit, and wherein the display and input task means further
    comprises display task means and input task means, the
    display task means being arranged to control the display
    means by transferring display commands to, and receiving
    the display responses from, the display means, the input
    task means being arranged to control the input means by
    transferring input commands to, and receiving input
    responses from, the input means.
    The Patent

  24. Quick Learning
    The Invention
    7. The system of claim 5 or 6, wherein the display and
    input means comprises displays means and input means,
    the input means being connected to the central processing
    unit, the display means being connected to the slithering
    means and the central processing unit, the display means
    being arranged to display the displays and the input means
    transferring the input responses to the central processing
    unit, and wherein the display and input task means further
    comprises display task means and input task means, the
    display task means being arranged to control the display
    means by transferring display commands to, and receiving
    the display responses from, the display means, the input
    task means being arranged to control the input means by
    transferring input commands to, and receiving input
    responses from, the input means.
    The Patent

  25. Invention has some code
    - 600 pages C
    - PDF
    - 1989

  26. Patent Compilation
    Does the patent even work?
    Would the code compile?
    Can it be explained to others?
    You'd better find out
    How?

  27. Hand Compilation from PDF
    - Use highlighter

  28. Enter Python
    # definitions: page number -> names defined on that page
    definitions = {
        450: [
            'spam',
            'grok',
        ],
        451: [
            'foo',
        ],
        452: [
            'bar',
        ],
    }
    # calls: page number -> names called on that page
    calls = {
        123: [
            'blah',
            'read_input',
            'send_msg',
        ],
        124: [
            'spam',
            'foo',
            'bar',
        ],
    }
    Entered by hand (from paper copy)
    A long weekend

  29. Just Link It
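    from collections import defaultdict
    # Map each defined name to the page that defines it
    symbols = { name: pageno
                for pageno, defns in definitions.items()
                for name in defns }
    # Calls to names that are not defined on any page
    unresolved = [ (name, pageno)
                   for pageno, clist in calls.items()
                   for name in clist
                   if name not in symbols ]
    # Group the unresolved names by the pages that call them
    missing = defaultdict(list)
    for name, pageno in unresolved:
        missing[name].append(pageno)
    for item in missing.items():
        print("Missing: %s on pages %s" % item)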

  30. Secret Weapons
    List/dict/set comprehensions
    collections module

  31. WHY?!?!?!?!
    Due diligence
    You'd better understand your side's invention
    Otherwise, you will die

  32. Meet The Enemy
    Alice

  33. Meet The Enemy
    Alice
    Alice's Ninja Rockstar Coders

  34. Alice's Ninja Rockstar Coders
    Meet The Enemy
    Alice

  35. Meet The Enemy
    Alice
    Alice's Adult Engineers
    SEI CMMI Level 4

  36. Here Are Some Documents
    500 pages

  37. 500 pages
    5,000 pages
    Here Are Some Documents

  38. 500 pages
    5,000 pages
    500,000 pages
    Here Are Some Documents

  39. (what's better than one? 300,000 that's what!)
    Sample Documents

  40. Purported Source Code?
    ATTORNEY EYES ONLY
    ATTORNEY EYES ONLY
    1677723 1677724

  41. From: Guido van Rossum
    Date: Dec 9 23:21:42 CET 2011
    Subject: [Python-Dev][PATCH] Adding braces to __future__
    For me, if I had to design a new language today, I
    would probably use braces, not because they're better
    than whitespace, but because pretty much every other
    language uses them, and there are more interesting
    concepts to distinguish a new language. That said, I
    don't regret that Python uses indentation, and the
    rest I have to say about the topic would violate the
    above request.
    --
    --Guido van Rossum (python.org/~guido)
    Emails

  42. From: Guido van Rossum
    Date: Dec 9 23:21:42 CET 2011
    Subject: [Python-Dev][PATCH] Adding braces to __future__
    For me, if I had to design a new language today, I
    would probably use braces, not because they're better
    than whitespace, but because pretty much every other
    language uses them, and there are more interesting
    concepts to distinguish a new language. That said, I
    don't regret that Python uses indentation, and the
    rest I have to say about the topic would violate the
    above request.
    --
    --Guido van Rossum (python.org/~guido)
    Emails
    Smoking gun?!?

  43. Alleged Prior Art

  44. Deposition of Crazy Old Guy
    Prior art

  45. We Have Their Software
    It's highly proprietary
    You're the only one approved to look at it
    It's actually sitting over in a vault
    AKA: Software escrow

  46. (no text on slide)

  47. The Vault

  48. The Vault
    By the tracks

  49. The Vault
    By the tracks
    Rock band rehearsal space

  50. Vault Protocol
    No computers
    No phone
    No electronics
    No storage devices
    Pen, paper and books okay

  51. The Vault
    PC in a locked cage (no network)
    Printer
    Special paper
    Log Book

  52. What's There?
    A collection of large hard drives
    D:\
    E:\
    F:\
    G:\
    Each containing copies of CDs (>1.5 TB total)
    No documentation or organization

  53. Perspective
    Software archive for the
    infringing invention

  54. Perspective
    Software archive for the
    infringing invention
    Embedded
    Microcontroller
    System

  55. Perspective
    Software archive for the
    infringing invention
    Embedded
    Microcontroller
    System
    Display
    Module
    Keypads
    7-segment

  56. Perspective
    Software archive for the
    infringing invention
    Embedded
    Microcontroller
    System
    Display
    Module
    Keypads
    7-segment
    A PC

  57. Perspective
    Software archive for the
    infringing invention
    Embedded
    Microcontroller
    System
    Display
    Module
    Keypads
    7-segment
    A PC Custom PCI
    Board

  58. Perspective
    Software archive for the
    infringing invention
    Embedded
    Microcontroller
    System
    Display
    Module
    Keypads
    7-segment
    A PC Custom PCI
    Board
    Second PC

  59. Perspective
    A PC Custom PCI
    Board
    Second PC
    Custom Router
    Actually, more of a distributed system

  60. Perspective
    The software is "all stack"
    (a million lines of code)
    C++/Win32
    C/ASM
    DCOM/CORBA
    C/ASM
    C/ASM
    VB
    Java RMI
    RTOS

  61. Enter Time
    OS/2
    90 92 94 95 96 97 98 00 01
    WinNT
    V1 V2 V3 V4 V5 V6 V7 V8 V9
    RevA RevB RevC
    • Weekly snapshots (52 x 15 years = 780 versions)
    • Multiple hardware revisions/configurations
    • Operating system changes/deployment changes

  62. Enter Customers
    • Dozen major customers (corporations)
    • Customer-specific system modifications
    • Think "skins" on main system
    • Hundreds of interlocking versions
    Base System: Version 2.51
    ACME: Vers 1.23
    Buy N Large: Vers 4.22
    Tyrell Corp: Vers 3.43

  63. Provided Tools
    Windows-XP

  64. Provided Tools
    Windows-XP
    Command Prompt

  65. Provided Tools
    Windows-XP
    Command Prompt
    Search Mutt

  66. Provided Tools
    Notepad

  67. Official Tools
    Notepad
    Visual Studio

  68. Printing
    You can print anything
    Must be logged
    Numbered, copied, given to opposing side

  69. Constraints
    No working hardware setup (can't run code)
    No working build environ (can't compile)
    No tech support (can't call anyone)
    Fragmentary documentation (if any)

  70. (no text on slide)

  71. Secret Weapon

  72. Python? What? How?
    Unknown: How did Python get placed on
    the machine in the vault?
    I have NO idea
    A new IBM PC with only "approved tools"
    Best Guess: Used by an IBM OEM tool
    (Yet, there it was, python... in the Windows path no less).

  73. Desert Island Coding
    Admit it, you've probably thought Python
    might be a good choice
    Batteries included FOR THE WIN!

  74. Strategy
    Create a fact discovery environment from
    scratch in the vault
    I was destined for this job... I wrote the book

  75. Question:
    What are the objectives?
    (What does it mean to "look at" the code?)

  76. Goals
    What was provided? Is it complete?
    How does the code work?
    Where is the patent in the code?

  77. The Horror! The Horror!
    Reverse engineering the entire build environ
    Makefiles, config files, etc.
    Identifying all major software components
    Examples: .exe files, .DLLs, plugins, etc.
    Sorting out version histories

  78. MKDEP= mkdep
    SHELL= /bin/sh
    # === Fixed definitions ===
    OBJS= \
    bltinmodule.o \
    ceval.o cgensupport.o compile.o \
    errors.o \
    frozen.o \
    getargs.o getcompiler.o getcopyright.o getm
    getplatform.o getversion.o graminit.o \
    import.o importdl.o \
    marshal.o modsupport.o mystrtoul.o \
    pythonrun.o \
    sigcheck.o structmember.o sysmodule.o \
    traceback.o \
    $(LIBOBJS)
    LIB= libPython.a
    Sources
    Library
    You try to figure it out

  79. Tackle the Provided Code

  80. Basic Tooling
    Reimplement Unix
    find
    grep
    wc
    diff
    tail
    head
    Because that Windows search mutt must die
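The slides show navigation and diff but not the other tools on this list. As a rough sketch only (not the code from the talk), grep, wc and head could be rebuilt from the standard library along these lines; the function signatures are assumptions made for illustration.

```python
import re
from itertools import islice


def grep(pattern, filename):
    """Yield (line_number, line) for each line of filename matching pattern."""
    regex = re.compile(pattern)
    # errors='replace' keeps going on files with odd encodings (an assumption
    # about what an undocumented archive might contain).
    with open(filename, errors='replace') as f:
        for lineno, line in enumerate(f, 1):
            if regex.search(line):
                yield lineno, line.rstrip('\n')


def wc(filename):
    """Return (lines, words, bytes) for a file, roughly like Unix wc."""
    with open(filename, 'rb') as f:
        data = f.read()
    return data.count(b'\n'), len(data.split()), len(data)


def head(filename, n=10):
    """Return the first n lines of a file as a list of strings."""
    with open(filename, errors='replace') as f:
        return [line.rstrip('\n') for line in islice(f, n)]
```

grep() is a generator, so it composes with the file-walking generators used later in the deck without reading whole trees into memory.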

  81. Example: navigation
    import os
    def cd(dirname):
        os.chdir(dirname)
    def pwd():
        print(os.getcwd())
    def ls(dirname=''):
        os.system('dir %s' % dirname)

  82. Example: diff
    # diff.py
    import sys, difflib
    def diff(fromfile, tofile):
        fromlines = open(fromfile).readlines()
        tolines = open(tofile).readlines()
        diff = difflib.context_diff(fromlines, tolines,
                                    fromfile, tofile)
        sys.stdout.writelines(diff)

  83. Interactive Shell
    >>> cd('pycode')
    >>> pwd()
    D:\Files\pycode
    >>> diff('Python-2.6/Lib/collections.py',
    ... 'Python-2.6.2/Lib/collections.py')
    *** Python-2.6/Lib/collections.py
    --- Python-2.6.2/Lib/collections.py
    ***************
    *** 103,109 ****
    # where the named tuple is created. Bypass this step in
    where
    # sys._getframe is not defined (Jython for example).
    if hasattr(_sys, '_getframe'):
    ! result.__module__ = _sys._getframe(1).f_globals['__n
    return result
    --- 103,109 ----
    # where the named tuple is created. Bypass this step in
    where

  84. More Than Reinvention
    Actually implementing an entire workflow
    Building up layers of tools/analyses
    Not unlike what is done with IPython NB
    Can't overstate Python awesomeness

  85. Example
    def allfiles(topdir):
        return ((path, filename)
                for path, dirs, files in os.walk(topdir)
                for filename in files)
    >>> files = allfiles('AllPython')
    >>> next(files)
    ('AllPython/0/python-0.9.1', 'python.man')
    >>> next(files)
    ('AllPython/0/python-0.9.1', 'README')
    >>>

  86. Example
    def filetypes(topdir):
        from collections import Counter
        from pprint import pprint
        c = Counter(os.path.splitext(name)[1]
                    for _, name in allfiles(topdir))
        pprint(c.most_common())
    >>> filetypes('AllPython')
    [('.py', 125277),
    ('.c', 27200),
    ('', 17010),
    ('.rst', 15439),
    ('.h', 14782),
    ('.tex', 12257),
    ...
    allfiles()

  87. Example
    def find(topdir, pattern):
        from fnmatch import fnmatch
        return ((path, name)
                for path, name in allfiles(topdir)
                if fnmatch(name, pattern))
    >>> f = find('AllPython', '*.py')
    >>> next(f)
    ('AllPython/0/python-0.9.1/demo/scripts', 'findlinksto.py')
    >>> next(f)
    ('AllPython/0/python-0.9.1/demo/scripts', 'mkreal.py')
    >>> next(f)
    ('AllPython/0/python-0.9.1/demo/scripts', 'ptags.py')
    >>>
    allfiles()
    filetypes()

  88. Example
    def find_versions(topdir):
        import re
        for path, _ in find(topdir, 'pgen.c'):
            pypath, _ = os.path.split(path)
            version = re.search(r'-(\w+\.\w+(\.\w+)?)$',
                                pypath).group(1)
            yield version, pypath
    allfiles()
    filetypes() find()

  89. Example
    >>> vers = find_versions('AllPython')
    >>> next(vers)
    ('0.9.1', 'AllPython/0/python-0.9.1')
    >>> next(vers)
    ('1.0.1', 'AllPython/1/python-1.0.1')
    >>>
    allfiles()
    filetypes() find()
    find_versions()

  90. Example
    def write_manifest(topdir):
        import csv
        f = open('manifest.csv','w')
        csv.writer(f).writerows(find_versions(topdir))
        f.close()
    allfiles()
    filetypes() find()
    find_versions()

  91. Example
    allfiles()
    filetypes() find()
    find_versions()
    write_manifest()

  92. Example
    allfiles()
    filetypes() find()
    find_versions()
    write_manifest()
    .csv
    Workflows!

  93. Pile it Higher and Higher
    HDDs
    Snapshots
    "Virtual File System"
    View View View
    You keep
    building
    abstractions
    Reorganized
    file layer
    Different
    views (version,
    date, prod,
    debug, etc.)
    .csv

  94. Example: Versioning
    def versions(filename):
        import hashlib
        from collections import defaultdict
        manifest = read_manifest()
        groups = defaultdict(list)
        for vers, path in manifest.items():
            fullname = os.path.join(path, filename)
            if os.path.exists(fullname):
                digest = hashlib.new('md5')
                digest.update(open(fullname,'rb').read())
                groups[digest.digest()].append(vers)
        return sorted([sorted(g) for g in groups.values()])
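The read_manifest() helper called above is not shown in the slides. A minimal sketch, assuming it simply loads the (version, path) rows that write_manifest() saved to manifest.csv into a dict, might look like this; the filename parameter is an assumption added for illustration.

```python
import csv


def read_manifest(filename='manifest.csv'):
    """Load (version, path) rows from the manifest file into a dict."""
    with open(filename, newline='') as f:
        # Skip blank rows, which can appear if the file was written
        # without newline='' on Windows.
        return {row[0]: row[1] for row in csv.reader(f) if row}
```

Returning a dict matches the manifest.items() loop in versions(), which expects version labels as keys and directory paths as values.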

  95. Example: Versioning
    >>> for x in versions('Python/thread.c'):
    ... print(x)
    ...
    ['1.0.1']
    ['1.1']
    ['1.2', '1.3']
    ['1.4']
    ['1.5', '1.5.1']
    ['1.5.2', '1.5.2c1']
    ['1.5.2b1', '1.5.2b2']
    ['1.6', '1.6b1']
    ['2.0', '2.0.1', '2.0c1', '2.1', '2.1.1', '2.1.2',
    '2.1.3']
    ...

  96. Navigational Tooling
    "Virtual File System"
    View View View
    Query tools for
    going to any
    version/file
    Navigational Tools
    >>> view('2.7.3', 'Python/ceval.c')
    >>>
    Typically launch windows tools (e.g., Vis Studio)
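The body of view() is not shown in the slides. One possible sketch, assuming it looks up a version's directory in the manifest and hands the file to whatever Windows associates with it, is below. The manifest and launcher parameters are hypothetical additions (the slide calls view() with two arguments) so the function can be exercised without opening a real application.

```python
import os


def view(version, filename, manifest, launcher=None):
    """Open one file from one version's source tree in an external viewer.

    manifest maps version labels to directories (hypothetical parameter).
    launcher is any callable taking a path; by default os.startfile is
    used, which exists only on Windows.
    """
    fullname = os.path.join(manifest[version], filename)
    if not os.path.exists(fullname):
        raise FileNotFoundError(fullname)
    if launcher is None:
        launcher = os.startfile  # Windows-only: open with the associated program
    launcher(fullname)
    return fullname
```

Passing print as the launcher is a simple way to check which path a query resolves to before launching anything.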

  97. Timeline/Inventory Tools
    Link together every version of every
    component found
    Development timelines
    Official vs. Debug releases
    V1
    V2
    V3
    V4
    release
    release
    release
    release
    release
    release
    release

  98. Commentary
    I don't know if the opposing side actually
    expected us to figure out their code
    We knew almost everything about everything
    Python FOR.THE.WIN.

  99. How Does Code Work?
    Better make sure you understand
    everything about the code
    Software architecture
    Interaction between components
    Underlying algorithms

  100. Problem: Code Sucks
    Nobody wants to read code
    Better: Design documents, specs
    Nobody wants to give you that
    "Go read the source."

  101. Let's Go Fishing
    Interesting files
    Code comments
    TPS reports
    PDF, DOC, RTF, HTML, TXT
    // See: Important Document
    Fixed bug. See important
    specification.
    I ὑ re

  102. Back and Forth
    An obscure find
    /* See FS-6541-8v2.0 for details */
    A request to attorneys
    "Tell opposing counsel we can't find FS-6541-8v2.0"
    A few silent days pass....
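Finding comments like the one above by hand does not scale to a million lines. A rough sketch of fishing for document references with a regular expression follows; the pattern (the word "See" followed by an upper-case prefix, a hyphen, and more identifier characters) is purely an assumption modeled on the single example ID on this slide, not the search actually used.

```python
import re

# Hypothetical pattern: "See" followed by an ID shaped like FS-6541-8v2.0.
DOC_REF = re.compile(r'\bSee\s+([A-Z]{2,}-[\w.-]*\w)')


def doc_references(lines):
    """Yield (line_number, document_id) for each 'See <ID>' reference found."""
    for lineno, line in enumerate(lines, 1):
        for match in DOC_REF.finditer(line):
            yield lineno, match.group(1)
```

Because it accepts any iterable of lines, the same function works on an open file or on the output of a grep-style generator.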

  103. (no text on slide)

  104. (no text on slide)

  105. (no text on slide)

  106. Casting a Wide Net
    Search for documents is far and wide
    Software change notices
    Unrelated software (peripheral devices)
    Emails
    The web (catalogs, manuals, job postings, etc.)
    Analogy: Pulling on a loose thread...

  107. Commentary
    You're learning the invention from scratch
    Reading other people's code
    You're teaching attorneys about it
    The other side doesn't want you to succeed
    You will learn A LOT in this exercise

  108. Some Lessons Learned
    SUCKS ROCKS

  109. Some Lessons Learned
    SUCKS ROCKS
    C++ Assembly code

  110. Some Lessons Learned
    SUCKS ROCKS
    C++ Assembly code
    Asynchronous Threads

  111. Some Lessons Learned
    SUCKS ROCKS
    C++ Assembly code
    Asynchronous Threads
    Objects Functions

  112. Some Lessons Learned
    SUCKS ROCKS
    C++ Assembly code
    Asynchronous Threads
    Objects Functions
    Makefiles
    IDEs

  113. Some Lessons Learned
    SUCKS ROCKS
    C++ Assembly code
    Asynchronous Threads
    Objects Functions
    Makefiles
    IDEs
    CASE Tools Humans

  114. Some Lessons Learned
    SUCKS ROCKS
    C++ Assembly code
    Asynchronous Threads
    Objects Functions
    Makefiles
    IDEs
    CASE Tools Humans
    UML Words

  115. Some Lessons Learned
    SUCKS ROCKS
    C++ Assembly code
    Asynchronous Threads
    Objects Functions
    Makefiles
    IDEs
    CASE Tools Humans
    1990s 1970s
    UML Words

  116. Some Lessons Learned
    SUCKS ROCKS
    C++ Assembly code
    Asynchronous Threads
    Objects Functions
    Makefiles
    IDEs
    CASE Tools Humans
    1990s 1970s
    (of course this is just my opinion, I could be wrong)
    UML Words

  117. Speaking of Attorneys
    Do the "facts" support patent infringement?
    Does it look like it infringes?
    Can it be proven that it infringes?
    (Let the game begin)

  118. (no text on slide)

  119. Remember this?
    7. The system of claim 5 or 6, wherein the display and
    input means comprises displays means and input means,
    the input means being connected to the central processing
    unit, the display means being connected to the slithering
    means and the central processing unit, the display means
    being arranged to display the displays and the input means
    transferring the input responses to the central processing
    unit, and wherein the display and input task means further
    comprises display task means and input task means, the
    display task means being arranged to control the display
    means by transferring display commands to, and receiving
    the display responses from, the display means, the input
    task means being arranged to control the input means by
    transferring input commands to, and receiving input
    responses from, the input means.
    The Patent

  120. Remember this?
    7. The system of claim 5 or 6, wherein the display and
    input means comprises displays means and input means,
    the input means being connected to the central processing
    unit, the display means being connected to the slithering
    means and the central processing unit, the display means
    being arranged to display the displays and the input means
    transferring the input responses to the central processing
    unit, and wherein the display and input task means further
    comprises display task means and input task means, the
    display task means being arranged to control the display
    means by transferring display commands to, and receiving
    the display responses from, the display means, the input
    task means being arranged to control the input means by
    transferring input commands to, and receiving input
    responses from, the input means.
    The Patent
    What does this claim mean?
    (let's rumble)

  121. (no text on slide)

  122. Defining Claim Terms
    7. The system of claim 5 or 6, wherein the display and
    input means comprises displays means and input means,
    the input means being connected to the central processing
    unit, the display means being connected to the slithering
    means and the central processing unit, the display means
    being arranged to display the displays and the input means
    transferring the input responses to the central processing
    unit, and wherein the display and input task means further
    comprises display task means and input task means, the
    display task means being arranged to control the display
    means by transferring display commands to, and receiving
    the display responses from, the display means, the input
    task means being arranged to control the input means by
    transferring input commands to, and receiving input
    responses from, the input means.
    The Patent
    Term                      Plaintiff    Defendant
    central processing unit
    display means

  123. Claim Construction
    Claim terms have to be supported by reality
    If not, it's game over
    A lot of attorney/expert consultation
    Problem: very specific facts and structure
    File: Widget/foo.c, lines 230-255.
    Requires a deep dive

  124. Problem
    Matching claims to 800 versions of a million-line program
    Pick one version?
    Which one?
    Match them all?

  125. Fragment Versioning
    You're familiar with source code control
    Imagine applying it to code fragments/excerpts
    In reverse
    Hmmm.

  126. /* source.c */
    void grok() {
        if (spam) {
            foo();
            bar();
        }
        ...
    }
    void blah() {
        ...
    }

  127. /* source.c */
    void grok() {
        if (spam) {
            foo();
            bar();
        }
        ...
    }
    void blah() {
        ...
    }
    file: source.c
    start: 'void grok()'
    end: 'void blah()'
    Fragment

  128. /* source.c */
    void grok() {
        if (spam) {
            foo();
            bar();
        }
        ...
    }
    void blah() {
        ...
    }
    file: source.c
    start: 'void grok()'
    end: 'void blah()'
    Fragment
    Snapshots (>800)
    Global fragment search across all versions

  129. /* source.c */
    void grok() {
        if (spam) {
            foo();
            bar();
        }
        ...
    }
    void blah() {
        ...
    }
    file: source.c
    start: 'void grok()'
    end: 'void blah()'
    Fragment
    Snapshots (>800)
    Ver1:
    void grok() {
        if (spam) {
            foo();
            bar();
        }
    }
    void blah() {
    Ver2:
    void grok() {
        if (spam) {
            foo();
            bar(x);
        }
    }
    void blah() {
    Ver3:
    void grok() {
        if (spam) {
            new_foo();
            bar(x);
        }
    }
    void blah() {
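The fragment idea above (a file, a start marker and an end marker, searched across every snapshot) can be sketched as follows. This is an illustration under stated assumptions, not the tool from the talk: snapshots are passed in as a dict of version label to source text, whitespace is normalized before comparing, and SHA-256 is used for the digest where the earlier versioning slide used MD5.

```python
import hashlib
from collections import defaultdict


def extract_fragment(text, start, end):
    """Return text from the start marker up to (not including) the end marker.

    Returns None if either marker is missing.
    """
    begin = text.find(start)
    if begin < 0:
        return None
    stop = text.find(end, begin + len(start))
    if stop < 0:
        return None
    return text[begin:stop]


def fragment_history(snapshots, start, end):
    """Group version labels by the content of one code fragment.

    snapshots maps a version label to that version's source text
    (a hypothetical input format chosen for this sketch).
    """
    groups = defaultdict(list)
    for version, text in snapshots.items():
        fragment = extract_fragment(text, start, end)
        if fragment is None:
            continue
        # Assumption: collapse whitespace so pure reformatting
        # does not count as a new version of the fragment.
        normalized = ' '.join(fragment.split())
        digest = hashlib.sha256(normalized.encode('utf-8')).hexdigest()
        groups[digest].append(version)
    return sorted(sorted(g) for g in groups.values())
```

Each inner list in the result is one distinct version of the fragment, which is the kind of reduction that turns hundreds of snapshots into a handful of cases to examine.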

  130. Big Picture
    Reduce a massive data set to something sane
    "This claim matches this structure in the code.
    There have only been six versions of this code over
    15 years. Here are the six versions."
    Keep in mind: All this happening in the vault
    Big collection of fragment histories

  131. PYTHON PYTHON PYTHON
    PYTHON PYTHON PYTHON
    PYTHON PYTHON PYTHON
    PYTHON PYTHON PYTHON
    PYTHON PYTHON PYTHON
    PYTHON PYTHON PYTHON

  132. PYTHON PYTHON PYTHON
    PYTHON PYTHON PYTHON
    PYTHON PYTHON PYTHON
    PYTHON PYTHON PYTHON
    PYTHON PYTHON PYTHON
    PYTHON PYTHON PYTHON
    Python makes the impossible possible

  133. PYTHON PYTHON PYTHON
    PYTHON PYTHON PYTHON
    PYTHON PYTHON PYTHON
    PYTHON PYTHON PYTHON
    PYTHON PYTHON PYTHON
    PYTHON PYTHON PYTHON
    Python makes the impossible possible
    (even Python 3)

  134. Final Thoughts
    If you get the chance to do this, do it!
    You will learn A LOT!
    Would I want to do it again? Not sure.

  135. But How Did it End?

  136. (no text on slide)

  137. My End Game
    I learned a lot about generator functions
    Ultimately a well-known PyCon tutorial...

  138. Postscript: Expert Report
    You may be asked to write an expert report
    Outlines all factual findings
    Ties facts to patent claims
    A scientific document
    It's a document that WILL be read

  139. Postscript: Deposition
    You
    A room of attorneys
    Opposing expert
    Court reporter
    Videographer
    8 hours
    It will be one of the most intense,
    surreal, awesome/worst
    experiences of your whole life.

  140. Postscript: Court Testimony
    Like deposition, but dialed up to 11
    Twice as many attorneys, more experts
    Judge & clerks

  141. Questions
