Skyfield and 15 Years of Bad APIs (Brandon Rhodes)

PyCon Canada

August 14, 2013

Transcript

  1. Skyfield and 15 Years of Bad APIs @brandon_rhodes PyCon Canada

    August 2013
  2. Goal To reflect upon the practice of Python API design

    through my recent work on the Skyfield astronomy library
  3. None
  4. None
  5. None
  6. Elwood Charles Downey et al 1990 Ephem 1993 XEphem

  7. e·phem·er·is — A table giving the coordinates of a celestial

    body at a number of specific times
  8.

        CDT 19:00:00 4/30/1990 | LST 8:19:50 | UTC 0:00:00 5/01/1990 |
        JulianDat 2448012.50000 | Dawn 4:10 | Watch
        Dusk 22:15 | Listing off | NiteLn 5:55 | Plot off
        NStep 1 | Menu Planet Data | StpSz RT CLOCK
        ---------------------------------------------------------
        OCX   R.A.      Dec      Az       Alt      HLong    HLat
        Su    2:32.3   14:58   278:40    12:38   220:22
        Mo    8:09.9   21:11   186:06    65:53   119:55    1:04
        Me    2:49.4   17:39   277:48    17:26   214:08    1:43
        Ve   23:49.4   -2:25   296:53   -27:39   282:39   -1:30
        Ma   22:39.8  -10:09   308:17   -44:14   297:56   -1:43
        Ju    6:30.9   23:23   235:13    59:04   106:16    0:08
        Sa   19:49.6  -20:53    17:24   -65:14   289:45    0:10
  9. None
  10. © XEphem’s author reserves the right to distribute binaries, but

    allows free download of the source
  11. Missing header (.h)

        X11/Intrinsic.h: No such file or directory
        ...
        $ apt-file search X11/Intrinsic.h
        libxt-dev: /usr/include/X11/Intrinsic.h

    Missing library (-l)

        /usr/bin/ld: cannot find -lXext
        ...
        $ apt-file search libXext.a
        libxext-dev: /usr/lib/i386-linux-gnu/libXext.a
  12. None
  13. Instead of using a GUI, I wanted to write scripts

  14. Inside XEphem’s C-language source code is a computation engine called

    libastro
  15. XEphem PyEphem libastro → libastro

  16. “You have my full permission to go with what you

    have.” — Elwood’s generous reply!
  17. PyEphem = C code Wrapper around libastro

  18. 1998 Beazley’s Simple Wrapper Interface Generator (SWIG)

  19. SWIG Exposed awkward details of C structs

        body = Obj()
        body.any.type = ephem.PLANET
        body.pl.code = ephem.SUN
        ephem.computeLocation(circum, body)
        print ephem.formatHours(o.any.ra, 36000)
        print ephem.formatDegrees(o.any.dec, 3600)
  20. 2003 Hand-written C that uses Python 2.2 superpowers

  21. Python 2.2 made attribute access easy to customize

        static PyGetSetDef body_getset[] = {
            {"ra", get_ra, 0, "right ascension"},
            {"dec", get_dec, 0, "declination"},
            {"elong", get_elong, 0, "elongation"},
            {"mag", get_mag, 0, "magnitude"},
            ⋮
        }
  22. And, I added a pure-Python wrapper on top like hashlib, sqlite3, and ssl

        # ephem/__init__.py
        import _libastro
        ⋮
  23. More Pythonic edition of PyEphem

        mars = ephem.Mars()
        mars.compute()
        print mars.ra, mars.dec
  24. But, both interfaces were still based on C code and

    required compilation
  25. Early 2000s

  26. Subject: PyEphem Win32 build errors

    Subject: win32, does PhEphem work there too?
    Subject: trying to download but I cant unzip it
  27. Late 2000s

  28. Subject: pyephem on Mac PPC

    Subject: pyEphem won’t build on Snow Leopard
    Subject: PyEphem … Installation error in opensuse
    Subject: PyEphem on Ubuntu 10.10
    Subject: Pyephem on a 64-bit Win7 PC?
  29. Mac: Sometimes a problem, but now on MacPorts!

    Windows:

        python setup.py bdist_wininst

    mingw: Open, but quirky
    Visual Studio Express: Works!
  30. None
  31. C extensions They are difficult to— Install Distribute Maintain

  32. I slowly grew open to the idea of an alternative

    approach
  33. An email “I’m interested in ephemeris options for astrology apps.

    You probably know about the Swiss Ephemeris. Do you know how it compares in accuracy?”
  34. libastro Predicts planetary positions using VSOP87 1987

  35. Swiss Ephemeris “based upon the DE406,” the JPL Long Ephemeris

    1997
  36. But wait!

        # ftp://ssd.jpl.nasa.gov/pub/eph/planets/ascii/

        Name     Date Modified
        ⋮
        de405/   10/7/07  8:00:00 PM
        de406/   3/22/11  8:00:00 PM
        ⋮
        de421/   2/6/13   7:00:00 PM
        de422/   8/3/11   8:00:00 PM
        de423/   3/30/10  8:00:00 PM
        ⋮
  37. DE421 More recent More accurate “planetary navigation accuracies”

  38. Wait—what? DE421 uses the “International Celestial Reference Frame”

  39. USNO Circular 179 George H. Kaplan 2005

  40. “The … resolutions passed by the International Astronomical Union at

    its General Assemblies in 1997 and 2000 are the most significant set of international agreements in positional astronomy in several decades and arguably since the Paris conference of 1896.”
  41. Time for a rewrite! Not a simple re-implementation But writing,

    in Python, a complete replacement of the old algorithms used throughout PyEphem
  42. Goals Replace entire API Integrate with scientific Python Support vector

    NumPy computations Avoid re-inventing SciPy wheels
  43. Example: Sunrise

    PyEphem: custom hand-written version of Newton’s Method
    New approach: an IPython Notebook that finds sunrise with scipy.optimize
  44. Skyfield

  45. Tools git Jedi py.test tox

  46. py.test

        py.test --pyargs skyfield \
            --doctest-glob='*.rst'
  47. py.test

        assert c.cal_date(jd) == timescales.cal_date(jd)
  48. I distribute tests

        setup(
            ...
            packages=['skyfield', 'skyfield.tests'],
            ...
        )
  49. I distribute docs

        setup(
            ...
            package_data={'skyfield': ['documentation/*.rst']},
            ...
        )
  50. Licensing

  51. GPL or MIT?

  52. GPL

        Free world           Closed world
        ──────────────────────
        Awesome!             Less awesome
            ↑                     ↑
        Your library   → ×   Alternative? Rewrite?
  53. MIT/BSD

        Free world           Closed world
        ──────────────────────
        Awesome!             Closed awesome
            ↑                     ↑
        Your library    →    Your library
  54. (Look closely: Open Source everywhere!)

        Free world           Closed world
        ──────────────────────
        Awesome!             Closed awesome
            ↑                     ↑
        Your library    →    Your library
  55. This is Python’s own model, and Python is increasingly everywhere

  56. MIT/BSD The free kind of freedom is what wins

  57. Technique

  58. single code base that runs on both Python 2 and

    Python 3
  59. Release small projects fast

    jplephem
    sgp4
  60. Carefully monitor my emotions

  61. Emotions tell me whether I am programming right

  62. Frustration? Clenching teeth? Trapped?

  63. Then I might be doing it wrong

  64. Frustration often signals inadequate project structure

  65. I notice that tiny goals are easier to reach and

    keep me calm
  66. “Test-driven development is a way of managing fear during programming.”

    — Kent Beck, Test Driven Development By Example
  67. Version Control “How soon can I commit so that I

    can’t lose what I’ve just typed?”
  68. Young “How can I split this task into smaller pieces

    for the computer?” Old “How can I split this task into smaller pieces for me?”
  69. Big slog: risks everything!

        here →→→→→ new feature

        here → • → • → • → • → new feature

    Incremental: cheap to revert
  70. Always use git stash to revert and try again
  71. Aim for a Moon Shot Write up an end-to-end example

    then move straight there
  72. The sins of my APIs

  73. Sin: Inscrutable names

    “Explicit is better than implicit”

        d = '2012/11/9'
        m = ephem.Mars(d)
        print(m.a_ra, m.a_dec)

        # Skyfield, instead:
        d = JulianDate('2012/11/9')
        p = earth(d).observe(mars).astrometric()
        print(p.ra, p.dec)
  74. Sin: storing results on object

        mars = ephem.Mars()
        mars.compute('2012/11/9')
        print(mars.ra, mars.dec)

        # Mars
        #  .name
        #  .date
        #  .compute() → ↘
        #  .ra      ↖   ↓
        #  .dec ← ← ← ← ← ← ← ↙
        #  ⋮
  75.

        # PyEphem:
        positions = []
        for date in dates:
            mars.compute(date)
            positions.append((mars.ra, mars.dec))

        # Skyfield:
        coords = [mars(d).astrometric() for d in dates]
        positions = [(c.ra, c.dec) for c in coords]
  76. Look familiar?

  77.

        letters = list(set(message_string))
        letters.sort()
        print ''.join(letters)

        # vs

        print ''.join(sorted(set(message_string)))
  78.

        outputs = sorted(inputs)

    Making code more functional is a big part of being Pythonic
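
The contrast between storing results on an object and returning them can be sketched in a few lines. This is only an illustration: `fake_position` and `StatefulBody` are hypothetical stand-ins invented here, not PyEphem or Skyfield code, and the arithmetic is a placeholder for a real ephemeris computation.

```python
from collections import namedtuple

Position = namedtuple('Position', 'ra dec')


def fake_position(date):
    """Hypothetical stand-in for a real ephemeris computation."""
    return Position(ra=date * 0.5, dec=date * -0.25)


class StatefulBody(object):
    """Stores each result on itself, so every call overwrites the last."""

    def compute(self, date):
        self.ra, self.dec = fake_position(date)


dates = [1.0, 2.0, 3.0]

# Stateful style: the caller must copy values out before the next call.
body = StatefulBody()
stateful = []
for date in dates:
    body.compute(date)
    stateful.append((body.ra, body.dec))

# Functional style: each call returns a fresh, immutable value.
functional = [fake_position(date) for date in dates]
```

Both lists hold the same numbers, but after the loop `body` only remembers the final date, while every entry in `functional` is still an independent value.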
  79. Sin: Concealing expense Your API is the only lifeline the

    programmer has to managing complexity and expense!
  80.

        # PyEphem
        m = ephem.Mars('2012/11/9')
        print(m.name)        # zero work
        print(m.ra, m.dec)   # computed
        print(m.rise_time)   # expensive!
  81. Guideline It’s okay to hide quick conveniences behind properties But

    expensive operations should always look like calls!
  82.

        mars.name        # looks cheap
        mars.apparent()  # looks expensive!

    Use this difference to train your customers!
  83.

        # “I can use this over and over again!”
        mars.name

        # “Looks expensive; I’ll save the
        # result to a local name instead.”
        mars.apparent()
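
A minimal sketch of that guideline, assuming a hypothetical `Planet` class (not Skyfield's real one) whose `apparent()` method counts how often the costly path runs. The summation is only a placeholder for real work.

```python
class Planet(object):
    """Hypothetical body: cheap data is an attribute, costly work is a call."""

    def __init__(self, name):
        self.name = name            # cheap: plain attribute
        self.expensive_calls = 0    # bookkeeping for this demonstration only

    def apparent(self):
        # Costly: the parentheses warn the caller that work happens here.
        self.expensive_calls += 1
        return sum(i * i for i in range(1000))  # placeholder computation


mars = Planet('Mars')

# The attribute looks free, so callers repeat it without worry.
labels = [mars.name for _ in range(3)]

# The call looks costly, so callers save the result to a local name.
position = mars.apparent()
results = [position for _ in range(3)]
```

Because the expensive operation is spelled as a call, the natural way to write the loop runs it once rather than three times.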
  84. The big lesson?

  85. Write APIs that TEACH!

  86. Write APIs that teach You know how your API works

    and how it can best be used
  87. Share as much of that knowledge as possible

  88. Write APIs that teach While an API should hide complexity

    it should also suggest best practices
  89. Write APIs that teach

        r = urllib2.urlopen(url)   # bad
        r = requests.get(url)      # good
  90. Why?

        r = requests.get(url)

    The very first line of requests code teaches users their first HTTP verb, without their even knowing it
  91. PyEphem made every coordinate an attribute, all looking exactly the same with no visible relationship!

        mars.a_ra, mars.a_dec   # astrometric
        mars.g_ra, mars.g_dec   # apparent geocentric
        mars.ra, mars.dec       # apparent topocentric
        mars.alt, mars.az       # apparent horizontal
  92. Skyfield has you build a result from smaller operations

        here = toronto(jd)
        here.observe(mars).astrometric()
        here.observe(mars).apparent().equatorial()
        here.observe(mars).apparent().horizontal()
  93. Sin: Confusing functions and methods

    Python supports both procedural and object-based methods, but choosing can be difficult
  94. PyEphem

        m = Mars(date)
        print(constellation(m.ra, m.dec))

        # This function can ONLY EVER be passed
        # a right ascension and declination; why
        # not make it a coordinate method?
  95. So here are some guidelines that I have been using

    lately
  96. 1. Methods constitute Python’s built-in typecheck — the one clean

    way to limit the types that a function will accept as an argument
  97. If you are tempted to start a function with if isinstance(…) then you might want a method instead!
  98. 2. f(x) should by definition touch only public features of an x
  99. If f(x) needs internal details, then either make it a method itself, or create a method that does the internal manipulation for it
  100. 3. f(x) should by definition not mutate the state of x

    (If you need to mutate state from outside, try the Adapter Pattern!)
  101. PyEphem

        m = Mars(date)
        print m.ra    # prints one thing
        toronto.next_rising(m)
        print m.ra    # something different!
  102. 4. Methods are discoverable (Jedi!)
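
Guidelines 1 to 3 above can be sketched together. The `Equatorial` class and `separation_from_pole` function below are hypothetical names chosen for illustration and are not part of PyEphem or Skyfield.

```python
class Equatorial(object):
    """Hypothetical coordinate type used only for this illustration."""

    def __init__(self, ra_hours, dec_degrees):
        self.ra_hours = ra_hours
        self.dec_degrees = dec_degrees

    def hemisphere(self):
        # Guideline 1: a method needs no isinstance() check, because only
        # objects that define it can be asked the question at all.
        return 'north' if self.dec_degrees >= 0 else 'south'


def separation_from_pole(coord):
    """Guidelines 2 and 3: read public attributes only, mutate nothing."""
    return 90.0 - coord.dec_degrees


coord = Equatorial(2.5, 15.0)
side = coord.hemisphere()
distance = separation_from_pole(coord)

# Asking the wrong type fails immediately, with no explicit type check.
try:
    'not a coordinate'.hemisphere()
except AttributeError:
    wrong_type_rejected = True
else:
    wrong_type_rejected = False
```

The function leaves `coord` exactly as it found it, so calling it can never change what a later `coord.dec_degrees` returns.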

  103. Skyfield: New tricks

  104. What about when you do need to dynamically compute an

    attribute?
  105.

        class Sample(object):
            @property
            def loud(self):
                return self.message.upper()
  106. Problem: hidden expense So, we cache

  107.

        class Sample(object):
            _loud = None

            @property
            def loud(self):
                if self._loud is None:
                    self._loud = self.message.upper()
                return self._loud
  108. Problem: makes every access more expensive

  109. Solution: dunder-getattr!

        class Sample(object):
            def __getattr__(self, name):
                if name == 'loud':
                    self.loud = self.message.upper()
                    return self.loud
                raise AttributeError()
  110. Solution: dunder-getattr

    For a value frequently accessed in heavy numeric work, __getattr__() runs only on the first lookup!
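
The claim that `__getattr__()` runs only on the first lookup can be checked with a small counter. This is a sketch based on the slide's `Sample` class; the `__init__` method and the `computations` counter are additions made here purely for the demonstration.

```python
class Sample(object):
    def __init__(self, message):
        self.message = message
        self.computations = 0   # demonstration-only counter

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal attribute lookup fails.
        if name == 'loud':
            self.computations += 1
            # Storing the value makes it a real instance attribute, so
            # later lookups succeed normally and never reach this method.
            self.loud = self.message.upper()
            return self.loud
        raise AttributeError(name)


sample = Sample('hello')
first = sample.loud    # triggers __getattr__ and caches the value
second = sample.loud   # ordinary instance-attribute lookup
```

After the first access the value lives in the instance dictionary, so later reads cost the same as any plain attribute.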
  111. Trick: scalar style NumPy initially excited me by letting me

    write scalar code that also works on arrays!
  112.

        def f(x, y):
            return sqrt(x*x + y*y)

        print(f(3, 4))
        # => 5

        x = array([3, 8, 60])
        y = array([4, 6, 80])
        print(f(x, y))
        # => array([5, 10, 100])
  113.

        # Number!
        jd = today()
        p = earth(jd).observe(planet)

        # Vector!
        jd = date_range('1980/1/1', '2010/1/1', 1.0)
        p = earth(jd).observe(planet)
  114. But scalar style is hard to maintain across a large code base!

        n = compute_nutation(jd)
        p = compute_precession(jd.tdb)
        f = J2000_to_ICRS
        t = einsum('jin,kjn->ikn', n, p)
        t = einsum('ijn,jk->ikn', t, f)
        pos = einsum('in,ijn->jn', pos, t)
        vel = einsum('in,ijn->jn', vel, t)
  115. One last hint Vi Hart says τ = 2π

  116. Q: Should dunder-init import things?

        # __init__.py
        from .earthlib import Topos
        from .planets import Jupiter

    to allow

        from skyfield import Topos, Jupiter
  117. Pro Simple way to surface your primary interface

  118. Cons Innocent import skyfield winds up importing everything
  119. Cons Imports do not match messages

        from skyfield import Topos
        print type(Topos)
        # => <class 'skyfield.coordinates.Topos'>
  120. Skyfield: decided against

    Not a one-trick-pony library
    Teach package structure
    Can always skyfield.api
  121. Opinionated defaults

  122. Should Skyfield include an ephemeris by default?

        setup(
            ⋮
            install_requires=['de421'],
        )
  123. Avoid Contrived tests

  124. How do you test both branches of the if?

        def f(...):
            ⋮
            # 20 lines of code
            ⋮
            if final_decision():
                action1
            else:
                action2
  125. Do not contrive complex calls to f() that exercise both branches

    Instead factor the function into smaller pieces that can be exercised separately
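
One way that factoring might look, as a sketch only: the names `is_above_horizon`, `describe` and `report` are hypothetical, and the horizon check is a stand-in for whatever the slide's `final_decision()` really computes.

```python
def is_above_horizon(altitude_degrees):
    """The decision, extracted so it can be tested directly."""
    return altitude_degrees > 0.0


def describe(visible):
    """The two actions, extracted so each branch can be tested directly."""
    return 'visible' if visible else 'below horizon'


def report(altitude_degrees):
    """What remains of the original function: glue between small pieces."""
    return describe(is_above_horizon(altitude_degrees))


# Each branch is reached with a trivial argument instead of a contrived
# end-to-end call.
branch_results = (describe(True), describe(False))
```

With the decision and the actions separated, each test needs one simple input and no knowledge of the twenty lines that used to precede the `if`.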
  126. Final trick Choosing a support forum

  127. Mailing list? Web forum? What happens 4 years later when

    someone has the same question?
  128. Answer can be difficult to find within a long discussion

    thread
  129. Answers go out of date

  130. Stack Overflow! Answer lives on same page Out-of-date answers get

    fixed
  131. The last mile

  132. The last mile Lesson #1

  133. New GitHub issue

  134. “PyEphem works from ssh but it does not work from

    the web”
  135. I was unhappy Hurts project reputation Issue says “you are

    wrong”
  136. None
  137. “Can you help me?”

  138. None
  139. “This question is not a good fit to our Q&A

    format”
  140. “Can you help me?”

  141. yes

  142.

        import sys
        open('/tmp/emergency.log', 'w').write(str(sys.path) + '\n')
  143.

        import sys
        sys.path.append('/home/astronomia/.local/lib64'
                        '/python2.6/site-packages')

    It worked
  144. “FYI the support solution is to keep that emergency code

    in every script”
  145. None
  146. “Brandon, thanks a lot for your help. PyEphem is great.”

  147. The last mile Lesson #2

  148. October 2012: I put NOVAS on PyPI

        pip install novas
  149. An email “I have been fiddling around with NOVAS_Py-3.1 and

    have had some problems…” (a list of bugs followed)
  150. The temptation Not my library — not my problem Provide

    the email address of USNO? Forward it to them myself?
  151. None
  152. Assistant Director for Exploration

    ...@nasa.gov
  153. USNO → me → NASA

  154. The last mile

  155. The last mile In open source, the final component of

    the API is very often you! You are the API of last resort
  156. The last mile In open source, the final component of

    the API is very often you! You are the API of last resort @brandon_rhodes Thank you very much!