Brian Corbin - Accelerating healthcare transactions with Python and PyPy

Python is well suited for many file processing tasks. However, Python is an uncommon
language choice for many organizations that need to process large files of
healthcare transactions. This talk will share lessons we've learned
processing healthcare transactions in Python running on PyPy.

https://us.pycon.org/2016/schedule/presentation/1917/

PyCon 2016

May 29, 2016

Transcript

  1. def accelerate_healthcare():
         """
         PyCon 2016 talk
         """
         with my_road_to_pycon() as ctx:
             ctx.healthcare_x12_standards()
             ctx.build_quickly_in_python()
             ctx.run_fast_with_pypy()
  5. Theodore Tanner Jr., Founder, CTO: Apple, Microsoft, MongoMusic, digiDesign
     Lisa Maki, Founder, CEO: Microsoft, BeliefNetworks, Benefitfocus
  6. WHO pays?!?!?
     • do I pay?
     • do you pay?
     • is there a single payer?
     • coordination of benefits!??!
     • the government?!?!
  7. Health + Tech
     • healthcare more complicated than posting a photo of your dinner
     • difficult for established folks to change
     • difficult for new folks to enter
  9. Things we can help fix…
     • make it easier to check your own information
     • make getting paid frictionless
     • providers
     • patient reimbursement
     • …
  10. X12 Healthcare
      • 270/271 eligibility: 552 pages
      • 276/277 claim status: 288 pages
      • 278 authorizations/referrals: 642 pages
      • 834 enrollment: 262 pages
      • 835 claim payment: 306 pages
      • 837 claims: 704 pages (professional)
  12. dictionary representation

      {
          "claim": {
              "place_of_service": "office",
              "total_charge_amount": 300.0,
              "patient_paid_amount": 300.0,
              "service_lines": [
                  {
                      "procedure_code": "99214",
                      "charge_amount": 70.00,
                      "unit_count": 1.0,
                      "diagnosis_codes": [
                          "Z00.00"
                      ],
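As a rough illustration of why a plain dictionary is convenient to work with, the sketch below walks a claim shaped like the fragment above. The slide is cut off after the first service line, so the closing structure is an assumption, and the helper names `total_service_charges` and `procedure_codes` are hypothetical, not part of the talk.

```python
# A claim dictionary shaped like the (truncated) slide fragment.
# Only fields visible on the slide are used; the closing brackets
# are an assumption because the slide is cut off.
claim_model = {
    "claim": {
        "place_of_service": "office",
        "total_charge_amount": 300.0,
        "patient_paid_amount": 300.0,
        "service_lines": [
            {
                "procedure_code": "99214",
                "charge_amount": 70.00,
                "unit_count": 1.0,
                "diagnosis_codes": ["Z00.00"],
            },
        ],
    }
}


def total_service_charges(model):
    """Sum charge_amount over every service line (hypothetical helper)."""
    lines = model["claim"].get("service_lines", [])
    return sum(line["charge_amount"] for line in lines)


def procedure_codes(model):
    """Collect the procedure codes billed on the claim (hypothetical helper)."""
    return [line["procedure_code"] for line in model["claim"]["service_lines"]]
```

Calling `total_service_charges(claim_model)` adds up only the service lines present in the dictionary, so with the single visible line it reflects that line's charge rather than the claim's `total_charge_amount`.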
  15. Problems
      Too big to move quickly - like a V8 chainsaw
      Too focused on specs/talking and not basic functionality - like pulling weeds all day in one corner of a large field
  20. Python Language Benefits
      • useful for large and small projects
      • standard library
      • decorators
      • generators
      • context managers
      • easy to tote around
  21. Interactive all day

      $ python
      Python 2.7.10 (3260adbeba4a, Apr 19 2016, 13:10:19)
      [PyPy 5.1.0 with GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
      Type "help", "copyright", "credits" or "license" for more information.
      >>>>
  22. The Workflow
      • Interactive evaluations become snippets and tests
      • Snippets become a library of functions
      • Ship it and circle back for more
  23. The Workflow
      • Reduce any remaining code duplication from 1st round of shipping
      • Functions may become classes
      • Optimize based on real world metrics
      • Call the ship back if another pass is needed
  24. Context Managers

      with X12File('test.270', **kw) as x12:
          with X12FunctionalGroup(x12, **kw) as fg:
              with X12TransactionSet(fg) as ts:
                  # generate segment data and then
                  # write out the segment id, data
                  ts.write_segment(id, data)
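The nesting on this slide mirrors the way X12 wraps data in envelopes that must be opened and closed in order. The talk does not show how `X12File` and friends are implemented, so the following is only a minimal sketch of the idea using `contextlib.contextmanager`. The `envelope` helper and `build_example` are hypothetical, and the segment contents are heavily simplified placeholders rather than valid X12.

```python
import io
from contextlib import contextmanager


@contextmanager
def envelope(out, header, trailer):
    """Write `header` on entry and `trailer` on exit (hypothetical helper)."""
    out.write(header + "~")
    try:
        yield out
    finally:
        # The trailer is written even if the body raises.
        out.write(trailer + "~")


def build_example():
    """Build a toy interchange; the segment contents are placeholders."""
    out = io.StringIO()
    with envelope(out, "ISA*...", "IEA*1"):                  # interchange
        with envelope(out, "GS*HS", "GE*1"):                 # functional group
            with envelope(out, "ST*270*0001", "SE*3*0001"):  # transaction set
                out.write("BHT*0022*13~")
    return out.getvalue()
```

The appeal of this shape is that each closing segment is emitted by the `with` block that opened it, so trailers come out in the reverse order of the headers without any bookkeeping in the calling code.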
  25. Decorators

      @matches_segment('EB', conditions={'EB01': 'C'})
      def eligibility_deductible_rule(self, segment, data):
          """
          Maps a deductible eligibility benefit segment
          to a dictionary/model.

          :param segment: The current segment being processed.
          :param data: a dictionary of data from the current segment.
          """
          pass
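The slide shows only how `matches_segment` is used, not how it works. One plausible way to build such a decorator is a registry that records each rule along with the conditions it cares about, as sketched below. Everything here is an assumption for illustration: the `RULES` list, the `dispatch` function, and the `amount` field are made up, and the rules are plain functions rather than the methods (with `self`) shown on the slide.

```python
RULES = []  # (segment_id, conditions, function) triples, in registration order


def matches_segment(segment_id, conditions=None):
    """Register the decorated function as a rule for matching segments."""
    def decorator(func):
        RULES.append((segment_id, conditions or {}, func))
        return func  # the function itself is left unchanged
    return decorator


def dispatch(segment_id, data):
    """Call every registered rule whose id and conditions match `data`."""
    results = []
    for rule_id, conditions, func in RULES:
        if rule_id != segment_id:
            continue
        if all(data.get(key) == value for key, value in conditions.items()):
            results.append(func(segment_id, data))
    return results


@matches_segment('EB', conditions={'EB01': 'C'})
def eligibility_deductible_rule(segment, data):
    # Field names other than EB01 are made up for this example.
    return {'type': 'deductible', 'amount': data.get('amount')}
```

With this arrangement, adding support for another segment means writing one more decorated function; the dispatch loop does not need to change.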
  26. Generators

      import pokitedi

      # parse X12 data and yield dictionaries containing
      # segment data
      for segment in pokitedi.x12.stream('example.271'):
          print(segment)

      # parse X12 data and yield dictionaries
      # representing transaction models
      segment_stream = pokitedi.x12.stream('example.271')
      for model in pokitedi.x12.model(segment_stream):
          print(model.coverage.active)
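The internals of `pokitedi.x12.stream` are not shown in the talk. To give a feel for the generator style it describes, here is a small sketch that yields one dictionary per segment. It assumes `~` as the segment terminator and `*` as the element separator, and the function name `stream_segments` and the dictionary layout are hypothetical.

```python
def stream_segments(text, segment_terminator='~', element_separator='*'):
    """Yield one dictionary per segment (hypothetical helper).

    The input string is split up front for simplicity; a real streaming
    parser would read the file in chunks instead.
    """
    for raw in text.split(segment_terminator):
        raw = raw.strip()
        if not raw:
            continue  # skip the empty piece after the final terminator
        elements = raw.split(element_separator)
        yield {'id': elements[0], 'elements': elements[1:]}


SAMPLE = "ST*271*0001~EB*C**30~SE*3*0001~"
```

Iterating with `for segment in stream_segments(SAMPLE): ...` hands back segments one at a time, so downstream code can start building models before the rest of the input has been consumed.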
  27. import csv

      import csv

      import pokitedi

      def models(csv_filename):
          with open(csv_filename, 'r') as csv_file:
              csv_reader = csv.DictReader(csv_file)
              for row in csv_reader:
                  # map_record is not shown on the slide
                  model = map_record(row)
                  yield model

      with open('example.271', 'w') as x12_file:
          for segment in pokitedi.x12.stream(models('test.csv')):
              x12_file.write(segment)
  28. PyPy (put some gas in it)
      The [...] becomes a [...] with PyPy’s Just-in-Time (JIT) Compiler
      In most cases, code just runs (except a lot faster)
  29. National Provider Identifier (NPI)
      unique 10-digit id number for health care providers in the United States
      Used frequently throughout X12 transactions
      Example: 1467857193 (me)
  30. NPI checksum
      • Luhn algorithm
      • used to validate credit card numbers, NPIs, Canadian Social Insurance Numbers, …
      • designed to catch accidental errors
      • See https://en.wikipedia.org/wiki/Luhn_algorithm for more information
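The timing commands and the bulk validation script in this deck prepend `80840` to the 10-digit NPI before running the Luhn check. Under that convention, the check digit of an NPI can be worked out as in the sketch below. The function names are hypothetical, and the code is an illustration rather than the speaker's implementation.

```python
NPI_PREFIX = '80840'  # prefix prepended to NPIs in this deck's examples


def luhn_total(number):
    """Luhn sum of a digit string, doubling every second digit from the right."""
    total = 0
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total


def npi_check_digit(first_nine):
    """Return the check digit for the first nine digits of an NPI."""
    # Append a placeholder 0 so the real digits sit in their final positions.
    total = luhn_total(NPI_PREFIX + first_nine + '0')
    return (10 - total % 10) % 10


def is_valid_npi(npi):
    """True if `npi` is 10 digits and passes the prefixed Luhn check."""
    return len(npi) == 10 and npi.isdigit() and luhn_total(NPI_PREFIX + npi) % 10 == 0
```

For the example NPI from the previous slide, the prefixed Luhn sum of `808401467857193` is 70, which is divisible by 10, so the number passes; changing the last digit breaks the check.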
  31. Some Python

      def digits_of(number):
          return [int(digit) for digit in str(number)]

      def luhn_checksum(card_number):
          digits = digits_of(card_number)
          odd_digits = digits[-1::-2]
          even_digits = digits[-2::-2]
          total = sum(odd_digits)
          for digit in even_digits:
              total += sum(digits_of(2 * digit))
          return total % 10

      def is_luhn_valid(card_number):
          return luhn_checksum(card_number) == 0
  32. NPI validation with Python

      $ python2.7 -m timeit -s "from wiki_luhn import is_luhn_valid" "is_luhn_valid('808401467857193')"
      100000 loops, best of 3: 9.74 usec per loop

      $ python3.5 -m timeit -s "from wiki_luhn import is_luhn_valid" "is_luhn_valid('808401467857193')"
      100000 loops, best of 3: 9.59 usec per loop

      $ pypy -m timeit -s "from wiki_luhn import is_luhn_valid" "is_luhn_valid('808401467857193')"
      1000000 loops, best of 3: 1.54 usec per loop
  33. Another approach in Python

      even_luhn_conversion = [0,2,4,6,8,1,3,5,7,9]

      def luhn_odd(number):
          return 0 if number == 0 else luhn_even(number // 10) + number % 10

      def luhn_even(number):
          return 0 if number == 0 else luhn_odd(number // 10) + even_luhn_conversion[number % 10]

      def luhn(number):
          return luhn_odd(int(number)) % 10

      def is_luhn_valid(number):
          return luhn(number) == 0
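The lookup table in this version replaces the "double the digit, then add its digits" step from the earlier implementation. The sketch below derives the table and cross-checks the mutually recursive functions against a straightforward loop. `digit_sum_of_double` and `luhn_reference` are names introduced here for illustration only.

```python
even_luhn_conversion = [0, 2, 4, 6, 8, 1, 3, 5, 7, 9]


def digit_sum_of_double(digit):
    """Double a single digit and add the digits of the result."""
    doubled = 2 * digit
    return doubled // 10 + doubled % 10


# The table is just this function precomputed for the digits 0 through 9.
derived_table = [digit_sum_of_double(d) for d in range(10)]


def luhn_odd(number):
    return 0 if number == 0 else luhn_even(number // 10) + number % 10


def luhn_even(number):
    return 0 if number == 0 else luhn_odd(number // 10) + even_luhn_conversion[number % 10]


def luhn_reference(number):
    """Straightforward Luhn checksum used only as a cross-check."""
    total = 0
    for position, char in enumerate(reversed(str(number))):
        digit = int(char)
        total += digit_sum_of_double(digit) if position % 2 else digit
    return total % 10


# True when both approaches agree on every number in a small range.
agree = all(luhn_odd(n) % 10 == luhn_reference(n) for n in range(2000))
```

Working on integers with `// 10` and `% 10` avoids building a list of digits for every call, which is presumably part of why this version benchmarks faster on the next slide.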
  34. Refined NPI validation with Python

      $ python2.7 -m timeit -s "from refined_luhn import is_luhn_valid" "is_luhn_valid('808401467857193')"
      100000 loops, best of 3: 2.21 usec per loop

      $ python3.5 -m timeit -s "from refined_luhn import is_luhn_valid" "is_luhn_valid('808401467857193')"
      100000 loops, best of 3: 3.54 usec per loop

      $ pypy -m timeit -s "from refined_luhn import is_luhn_valid" "is_luhn_valid('808401467857193')"
      1000000 loops, best of 3: 0.501 usec per loop
  35. Validate all the NPIs (Python)

      from refined_luhn import is_luhn_valid

      def validate_all_npi_values():
          row_count = 0
          with open('/npi/npidata.csv', 'r') as npi_file:
              for row in npi_file:
                  # skip the header row
                  if row_count > 0:
                      fields = row.split(',')
                      # strip the surrounding quotes from the first field
                      npi = '80840' + fields[0][1:-1]
                      is_luhn_valid(npi)
                  row_count += 1

      if __name__ == "__main__":
          validate_all_npi_values()
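The script on this slide splits each line on commas and discards the validation result, which is fine for a benchmark. A variant that reports results might look like the sketch below, which uses the standard `csv` module so quoted fields containing commas are handled. The sample data, its column names, and the function `count_npi_results` are made up for illustration, and the Luhn check is an iterative rewrite rather than the `refined_luhn` module from the talk.

```python
import csv
import io

EVEN_LUHN = [0, 2, 4, 6, 8, 1, 3, 5, 7, 9]


def is_luhn_valid(number):
    """Iterative Luhn check on a string of digits."""
    total = 0
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        total += EVEN_LUHN[digit] if position % 2 else digit
    return total % 10 == 0


def count_npi_results(csv_file):
    """Return (valid, invalid) counts for the first column of a CSV file."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row
    valid = invalid = 0
    for row in reader:
        if not row:
            continue  # ignore blank lines
        if is_luhn_valid('80840' + row[0]):
            valid += 1
        else:
            invalid += 1
    return valid, invalid


# Made-up sample; the second column shows a quoted field containing a comma.
SAMPLE = (
    '"NPI","Name"\n'
    '"1467857193","Example, Provider"\n'
    '"1467857194","Other"\n'
)
```

Passing `io.StringIO(SAMPLE)` to `count_npi_results` exercises the function without needing the real data file; an open file object works the same way.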
  36. Validate all the NPIs (Java)
      [ran out of slide space…]

      import java.io.BufferedReader;
      import java.io.FileReader;
      import java.io.IOException;
      import org.apache.commons.validator.routines.checkdigit.LuhnCheckDigit;

      public class CrunchNPI {
          public static void main(String[] args) {
              LuhnCheckDigit luhn = new LuhnCheckDigit();
              //Input file which needs to be parsed
              String fileToParse = "/npi/npidata.csv";
              BufferedReader fileReader = null;
              //Delimiter used in CSV file
              final String DELIMITER = ",";
              try
  37. ALL THE NPIs! (about 4.8M of them)

      $ time java CrunchNPI
      real 0m44.957s
      user 0m41.870s
      sys 0m2.730s

      $ time pypy crunch_npi.py
      real 0m39.302s
      user 0m35.560s
      sys 0m1.950s
  38. Take it to the house
      • Interactive development + a JIT helps you [...] and [...]
      • Python and PyPy are one way to achieve this powerful combination
      • Folks need to be open minded about different approaches that produce results
  39. Some References
      Brett Cannon’s great talk “An unscientific survey of Python interpreters” & Pyjion
      http://nbviewer.jupyter.org/gist/brettcannon/9d19cc184ea45b3e7ca0
      https://github.com/Microsoft/Pyjion
      Luhn Algorithm
      https://en.wikipedia.org/wiki/Luhn_algorithm
      Accelerating Healthcare Companion
      https://github.com/corbinbs/accelerating-healthcare-python-pypy