
Brian Corbin - Accelerating healthcare transactions with Python and PyPy


Python is well suited for many file processing tasks. However, Python is an uncommon
language choice for many organizations that need to process large files of
healthcare transactions. This talk will share lessons we've learned
processing healthcare transactions in Python running on PyPy.

https://us.pycon.org/2016/schedule/presentation/1917/

PyCon 2016

May 29, 2016


Transcript

  1. Accelerating Healthcare
    Transactions with Python and PyPy
    Brian Corbin


  2. def accelerate_healthcare():
         """
         PyCon 2016 talk
         """
         with my_road_to_pycon() as ctx:
             ctx.healthcare_x12_standards()
             ctx.build_quickly_in_python()
             ctx.run_fast_with_pypy()

  3. My road to PyCon 2016

  4. 4


  5. “Ignorance breeds Ignorance”

  6. 6


  7. rural computing

  8. hub-ology.org
    Hub City Python User Group

  9. 9


  10. December 2011


  11. Theodore Tanner Jr.
    Founder, CTO
    Apple, Microsoft, MongoMusic, digiDesign
    Lisa Maki
    Founder, CEO
    Microsoft, BeliefNetworks, Benefitfocus

  12. … and now
    HEALTHCARE!!!

  13. It impacts all of us

  14. WHO pays?!?!?
    •do I pay?
    •do you pay?
    •is there a single payer?
    •coordination of benefits!??!
    •the government?!?!

  15. Health + Tech
    • healthcare more complicated
    than posting a photo of your
    dinner
    • difficult for established folks
    to change
    • difficult for new folks to enter

  16. 16


  17. Things we can help fix…
    •make it easier to check your
    own information
    •make getting paid frictionless
    •providers
    •patient reimbursement
    •…

  18. “The great thing
    about standards is…
    there are so many of
    them!”

  19. X12 Healthcare
    •270/271 eligibility 552 pages
    •276/277 claim status 288 pages
    •278 authorizations/referrals
    642 pages
    •834 enrollment 262 pages
    •835 claim payment 306 pages
    •837 claims 704 pages (professional)

  20. 20


  21. The X12 Envelope
    ISA*…*T*:~
    GS*HS*000000005*12345*…~
    ST*270*0001*005010X279A1~
    .
    .
    .
    SE*13*0001~
    GE*1*1~
    IEA*1*000000907~
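
An envelope like the one above can be taken apart with plain string splitting. This is a minimal sketch, assuming `~` ends each segment and `*` separates elements as on the slide (real files declare their delimiters in the ISA header). `split_segments` is a hypothetical helper, not part of any library, and the sample leaves out the ISA and GS segments because their contents are elided on the slide.

```python
def split_segments(x12_text, segment_terminator="~", element_separator="*"):
    """Split raw X12 text into lists of elements, one list per segment."""
    segments = []
    for raw in x12_text.split(segment_terminator):
        raw = raw.strip()
        if raw:  # skip the empty tail after the last terminator
            segments.append(raw.split(element_separator))
    return segments


# Only the segments shown in full on the slide are used here.
sample = "ST*270*0001*005010X279A1~\nSE*13*0001~\nGE*1*1~\nIEA*1*000000907~\n"
segments = split_segments(sample)
segment_ids = [segment[0] for segment in segments]
print(segment_ids)

# The ST and SE segments carry the same control number (0001 on the slide).
control_numbers_match = segments[0][2] == segments[1][2]
print(control_numbers_match)
```

Each header segment has a matching trailer (ISA/IEA, GS/GE, ST/SE), so a first sanity check on a file is often just comparing the control numbers at both ends.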

  22. X12 837 (claim) snippet
    CLM*ABC123*300***11:B:1*Y*A*Y*I~
    AMT*F5*300~
    HI*ABK:Z0000~
    LX*1~
    SV1*HC:99214*70*UN*1.0***1~
    DTP*472*D8*20160513~
    LX*2~
    SV1*HC:85027*30*UN*1.0***1~
    DTP*472*D8*20160513~
    LX*3~

  23. dictionary representation
    {
        "claim": {
            "place_of_service": "office",
            "total_charge_amount": 300.0,
            "patient_paid_amount": 300.0,
            "service_lines": [
                {
                    "procedure_code": "99214",
                    "charge_amount": 70.00,
                    "unit_count": 1.0,
                    "diagnosis_codes": [
                        "Z00.00"
                    ],
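
One way to get from the 837 segments to a dictionary like this is to walk the segments and pick out elements by position. The following is a rough sketch, not the speaker's implementation: `claim_to_dict` is a hypothetical function, it handles only the CLM and SV1 segments, and it skips the place of service, patient paid amount, and diagnosis code mapping shown on the slide.

```python
def claim_to_dict(segments):
    """Build a small claim dictionary from CLM and SV1 segments."""
    claim = {"service_lines": []}
    for segment in segments:
        segment_id = segment[0]
        if segment_id == "CLM":
            # CLM02 holds the total charge amount for the claim.
            claim["total_charge_amount"] = float(segment[2])
        elif segment_id == "SV1":
            # SV101 is a composite such as "HC:99214": qualifier, then code.
            procedure_code = segment[1].split(":")[1]
            claim["service_lines"].append({
                "procedure_code": procedure_code,
                "charge_amount": float(segment[2]),  # SV102
                "unit_count": float(segment[4]),     # SV104
            })
    return {"claim": claim}


# Segments copied from the 837 snippet slide.
snippet = (
    "CLM*ABC123*300***11:B:1*Y*A*Y*I~"
    "SV1*HC:99214*70*UN*1.0***1~"
    "SV1*HC:85027*30*UN*1.0***1~"
)
segments = [raw.split("*") for raw in snippet.split("~") if raw]
result = claim_to_dict(segments)
print(result)
```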

  24. 24


  25. “That’s the way we’ve
    always done it”

  26. 26


  27. Problems
    Too big to move quickly
    - like a V8 chainsaw

    Too focused on specs/talking and
    not basic functionality
    - like pulling weeds all day in one
    corner of a large field

  28. Tractors and weed
    eaters

  29. 29


  30. 30


  31. 31

  32. Python: The diamond
    plate truck toolbox

  33. 33


  34. Python Language Benefits
    •useful for large and small projects
    •standard library
    •decorators
    •generators
    •context managers
    •easy to tote around

  35. How to tractor with
    Python?

  36. Interactive all day
    $ python
    Python 2.7.10 (3260adbeba4a, Apr 19 2016, 13:10:19)
    [PyPy 5.1.0 with GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>>>

  37. The Workflow
    • Interactive evaluations become
    snippets and tests
    • Snippets become a library of
    functions
    • Ship it and circle back for more

  38. How to weed eat?

  39. The Workflow
    • Reduce any remaining code
    duplication from 1st round of tractoring
    • Functions may become classes
    • Optimize based on real world metrics
    • Call the tractor back if another pass is
    needed

  40. X12 processing with
    Python

  41. Context Managers
    with X12File('test.270', **kw) as x12:
        with X12FunctionalGroup(x12, **kw) as fg:
            with X12TransactionSet(fg) as ts:
                # generate segment data and then
                # write out the segment id, data
                ts.write_segment(id, data)
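
Context managers fit X12 well because every header segment needs a matching trailer. Below is a minimal sketch of the idea for the innermost ST/SE level only. `TransactionSetWriter` and `transaction_set` are hypothetical names, not the classes used on the slide, and the BHT values are just sample data.

```python
import io
from contextlib import contextmanager


class TransactionSetWriter:
    """Writes segments for one ST/SE transaction set and counts them."""

    def __init__(self, out):
        self.out = out
        self.segment_count = 0

    def write_segment(self, segment_id, elements):
        self.out.write("*".join([segment_id] + list(elements)) + "~\n")
        self.segment_count += 1


@contextmanager
def transaction_set(out, transaction_type, control_number):
    writer = TransactionSetWriter(out)
    # Header is written on entry to the with block.
    writer.write_segment("ST", [transaction_type, control_number])
    yield writer
    # Trailer is written on normal exit: SE01 counts every segment,
    # including ST and SE themselves.
    writer.write_segment("SE", [str(writer.segment_count + 1), control_number])


buffer = io.StringIO()
with transaction_set(buffer, "270", "0001") as ts:
    ts.write_segment("BHT", ["0022", "13"])
output = buffer.getvalue()
print(output)
```

The caller never has to remember the trailer or keep its own segment count, which is the bookkeeping that tends to go wrong when envelopes are written by hand.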

  42. Decorators
    @matches_segment('EB', conditions={'EB01': 'C'})
    def eligibility_deductible_rule(self, segment, data):
        """
        Maps a deductible eligibility benefit segment
        to a dictionary/model.
        :param segment: The current segment being processed.
        :param data: a dictionary of data from the current segment.
        """
        pass
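
The slide does not show how `matches_segment` is implemented. One plausible shape, offered only as a sketch, is a decorator that tags the rule with the segment id and conditions it handles, plus a helper that checks those tags against an incoming segment. `rule_applies` is a hypothetical helper, the rule here is a plain function rather than a method, and the use of EB07 for the amount is an illustrative choice.

```python
def matches_segment(segment_id, conditions=None):
    """Tag a rule function with the segment id and element values it handles."""
    conditions = conditions or {}

    def decorator(func):
        func.segment_id = segment_id
        func.conditions = conditions
        return func

    return decorator


def rule_applies(rule, segment_id, data):
    """Return True when the segment id and every condition match."""
    if rule.segment_id != segment_id:
        return False
    return all(data.get(key) == value for key, value in rule.conditions.items())


@matches_segment("EB", conditions={"EB01": "C"})
def eligibility_deductible_rule(segment, data):
    # Illustrative mapping only: pull one element into a dictionary.
    return {"deductible": data.get("EB07")}


data = {"EB01": "C", "EB07": "500"}
applies = rule_applies(eligibility_deductible_rule, "EB", data)
mapped = eligibility_deductible_rule(None, data)
print(applies, mapped)
```

Keeping the matching criteria next to the rule means a dispatcher can loop over all tagged rules without a growing chain of `if` statements.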

  43. Generators
    import pokitedi

    # parse X12 data and yield dictionaries containing
    # segment data
    for segment in pokitedi.x12.stream('example.271'):
        print(segment)

    # parse X12 data and yield dictionaries
    # representing transaction models
    segment_stream = pokitedi.x12.stream('example.271')
    for model in pokitedi.x12.model(segment_stream):
        print(model.coverage.active)
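
The internals of `pokitedi.x12.stream` are not shown, so the following is only a sketch of how a generator can stream segments from a large file without loading it all into memory. `stream_segments` is a hypothetical function; it assumes `~` and `*` delimiters and yields lists of elements rather than dictionaries.

```python
import io


def stream_segments(x12_file, chunk_size=4096, terminator="~", separator="*"):
    """Lazily yield one list of elements per segment from a file-like object."""
    pending = ""
    while True:
        chunk = x12_file.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last terminator is complete; keep the rest.
        *complete, pending = pending.split(terminator)
        for raw in complete:
            raw = raw.strip()
            if raw:
                yield raw.split(separator)
    if pending.strip():
        # An unterminated final segment is still passed along.
        yield pending.strip().split(separator)


# A tiny chunk size forces segments to straddle chunk boundaries.
source = io.StringIO("ST*271*0001~\nEB*1~\nSE*3*0001~\n")
segments = list(stream_segments(source, chunk_size=5))
print(segments)
```

Because the function yields as it reads, a consumer can start mapping segments to models after the first chunk rather than after the whole file.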

  44. import csv
    import csv
    import pokitedi

    def models(csv_filename):
        # map_record is not shown on the slide; it maps a CSV row to a model
        with open(csv_filename, 'r') as csv_file:
            csv_reader = csv.DictReader(csv_file)
            for row in csv_reader:
                model = map_record(row)
                yield model

    with open('example.271', 'w') as x12_file:
        for segment in pokitedi.x12.stream(models('test.csv')):
            x12_file.write(segment)



  45. …and now buckle your
    safety belts

  46. …but my momma’s
    brother’s neighbor’s
    coworker’s best friend
    said Python was too
    slow and you can’t
    scale…

  47. How to make it go fast?

  48. PyPy (put some gas in it)
    The becomes a with PyPy’s
    Just-in-Time (JIT) Compiler
    In most cases, code just runs (except
    a lot faster)

  49. …but XYZ doesn’t work on PyPy

  50. The main event:
    Python and PyPy vs.
    ignorance

  51. National Provider Identifier (NPI)
    unique 10-digit id number for health
    care providers in the United States
    Used frequently throughout X12
    transactions
    Example: 1467857193 (me)

  52. NPI checksum
    •Luhn algorithm
    •used to validate credit card numbers, NPIs, Canadian
    Social Insurance Numbers, …
    •designed to catch accidental errors
    •See https://en.wikipedia.org/wiki/Luhn_algorithm
    for more information
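
The check described above can be sketched in a few lines. This is a minimal illustration that assumes the NPI arrives as a 10-character string. `luhn_total` and `is_valid_npi` are hypothetical names, and the `80840` prefix is the one used in the talk's timing and file-processing examples.

```python
NPI_PREFIX = "80840"  # prepended before the Luhn check, as in the talk's examples


def luhn_total(digits):
    """Sum digits right to left, doubling every second one (Luhn)."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total


def is_valid_npi(npi):
    """Check a 10-digit NPI string by running Luhn over prefix + NPI."""
    if len(npi) != 10 or not (npi.isascii() and npi.isdigit()):
        return False
    return luhn_total(NPI_PREFIX + npi) % 10 == 0


print(is_valid_npi("1467857193"))  # the example NPI from the previous slide
print(is_valid_npi("1467857194"))  # last digit changed
```

Changing any single digit changes the total by a value that is not a multiple of 10, so single-digit typos like the second call are caught.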

  53. Some Python
    def digits_of(number):
        return [int(digit) for digit in str(number)]

    def luhn_checksum(card_number):
        digits = digits_of(card_number)
        odd_digits = digits[-1::-2]
        even_digits = digits[-2::-2]
        total = sum(odd_digits)
        for digit in even_digits:
            total += sum(digits_of(2 * digit))
        return total % 10

    def is_luhn_valid(card_number):
        return luhn_checksum(card_number) == 0

  54. NPI validation with Python
    $ python2.7 -m timeit -s "from wiki_luhn import is_luhn_valid" "is_luhn_valid('808401467857193')"
    100000 loops, best of 3: 9.74 usec per loop

    $ python3.5 -m timeit -s "from wiki_luhn import is_luhn_valid" "is_luhn_valid('808401467857193')"
    100000 loops, best of 3: 9.59 usec per loop

    $ pypy -m timeit -s "from wiki_luhn import is_luhn_valid" "is_luhn_valid('808401467857193')"
    1000000 loops, best of 3: 1.54 usec per loop

  55. Another approach in Python
    even_luhn_conversion = [0,2,4,6,8,1,3,5,7,9]

    def luhn_odd(number):
        return 0 if number == 0 else luhn_even(number // 10) + number % 10

    def luhn_even(number):
        return 0 if number == 0 else luhn_odd(number // 10) + even_luhn_conversion[number % 10]

    def luhn(number):
        return luhn_odd(int(number)) % 10

    def is_luhn_valid(number):
        return luhn(number) == 0

  56. Refined NPI validation with Python
    $ python2.7 -m timeit -s "from refined_luhn import is_luhn_valid" "is_luhn_valid('808401467857193')"
    100000 loops, best of 3: 2.21 usec per loop

    $ python3.5 -m timeit -s "from refined_luhn import is_luhn_valid" "is_luhn_valid('808401467857193')"
    100000 loops, best of 3: 3.54 usec per loop

    $ pypy -m timeit -s "from refined_luhn import is_luhn_valid" "is_luhn_valid('808401467857193')"
    1000000 loops, best of 3: 0.501 usec per loop
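
Before trusting a faster version, it helps to confirm it agrees with the slower one. The sketch below puts a string-based checksum next to a table-lookup checksum, compares them over a range of inputs, and times both with the `timeit` module. `luhn_by_strings` and `luhn_by_table` are hypothetical names, and the table version is an iterative variant written for this example, not the recursive code from the talk. Timings depend on the machine and interpreter, so none are quoted here.

```python
import timeit

EVEN_LUHN_CONVERSION = [0, 2, 4, 6, 8, 1, 3, 5, 7, 9]


def luhn_by_strings(number):
    """Digit-list approach: convert to strings and sum the doubled digits."""
    digits = [int(d) for d in str(number)]
    total = sum(digits[-1::-2])
    for digit in digits[-2::-2]:
        total += sum(int(d) for d in str(2 * digit))
    return total % 10


def luhn_by_table(number):
    """Integer approach: peel digits with divmod, double via a lookup table."""
    number, total, double = int(number), 0, False
    while number:
        number, digit = divmod(number, 10)
        total += EVEN_LUHN_CONVERSION[digit] if double else digit
        double = not double
    return total % 10


# Both versions should return the same checksum for every input.
samples = [str(n) for n in range(808401467857000, 808401467857200)]
agree = all(luhn_by_strings(s) == luhn_by_table(s) for s in samples)
print("implementations agree:", agree)

for func in (luhn_by_strings, luhn_by_table):
    best = min(timeit.repeat(lambda: func("808401467857193"), number=10000, repeat=3))
    print(func.__name__, round(best / 10000 * 1e6, 3), "usec per loop")
```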

  57. How long to
    validate all NPIs?

  58. Validate all the NPIs (Python)
    from refined_luhn import is_luhn_valid

    def validate_all_npi_values():
        row_count = 0
        with open('/npi/npidata.csv', 'r') as npi_file:
            for row in npi_file:
                if row_count > 0:
                    fields = row.split(',')
                    npi = '80840' + fields[0][1:-1]
                    is_luhn_valid(npi)
                row_count += 1

    if __name__ == "__main__":
        validate_all_npi_values()
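
The loop above splits each row on commas and slices off the surrounding quotes by hand. As an alternative sketch, the standard `csv` module can do the quote handling. `npi_values` is a hypothetical helper, the sample rows and the `Name` column are made up, and no claim is made here about how this compares in speed to the manual split.

```python
import csv
import io


def npi_values(csv_file):
    """Yield the first column of each data row, skipping the header row."""
    reader = csv.reader(csv_file)
    next(reader, None)  # the first row is assumed to be a header
    for row in reader:
        if row:
            yield row[0]  # csv.reader has already removed the quotes


# Made-up sample in the assumed layout: quoted fields, NPI in the first column.
sample = io.StringIO(
    '"NPI","Name"\n'
    '"1467857193","Example, Inc."\n'
    '"1234567893","Other"\n'
)
values = list(npi_values(sample))
print(values)
```

Quoted fields that contain commas, like the made-up name in the second row, are kept intact by `csv.reader`, which matters if more than the first column is ever needed.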

  59. Validate all the NPIs (Java) [ran out of slide space…]
    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import org.apache.commons.validator.routines.checkdigit.LuhnCheckDigit;

    public class CrunchNPI
    {
        public static void main(String[] args)
        {
            LuhnCheckDigit luhn = new LuhnCheckDigit();
            //Input file which needs to be parsed
            String fileToParse = "/npi/npidata.csv";
            BufferedReader fileReader = null;
            //Delimiter used in CSV file
            final String DELIMITER = ",";
            try

  60. ALL THE NPIs! (about 4.8M of them)
    $ time java CrunchNPI
    real 0m44.957s
    user 0m41.870s
    sys 0m2.730s
    $ time pypy crunch_npi.py
    real 0m39.302s
    user 0m35.560s
    sys 0m1.950s

  61. Algorithms still
    matter

  62. Take it to the house
    •Interactive development + a JIT
    helps you and
    •Python and PyPy are one way to
    achieve this powerful combination
    •Folks need to be open minded about
    different approaches that produce
    results

  63. Some References
    Brett Cannon’s great talk “An unscientific survey of Python
    interpreters” & Pyjion
    http://nbviewer.jupyter.org/gist/brettcannon/9d19cc184ea45b3e7ca0
    https://github.com/Microsoft/Pyjion
    Luhn Algorithm
    https://en.wikipedia.org/wiki/Luhn_algorithm
    Accelerating Healthcare Companion
    https://github.com/corbinbs/accelerating-healthcare-python-pypy

  64. Thank y’all