ScalaPB: Protocol Buffer Case Class Generator in Scala

In this talk, I will give a brief overview of Protocol Buffers and when it makes sense to use them instead of other data serialization formats such as JSON. We will then dive into ScalaPB, explore its API, and see how lenses enable an elegant syntax for updating deeply nested immutable data structures. I will show some unique features of ScalaPB, such as custom type mappers. Finally, as Protocol Buffers can represent arbitrarily complex data structures, I will show how ScalaCheck is used to generate increasingly large protocol buffer structures (and fill them with random data) to validate the correctness of the library through property checking. If there's time left, I will also give a sneak peek of the upcoming release that supports the new proto3 syntax.

Nadav Samet

May 26, 2015

Transcript

  1. ROADMAP
    Introduction to Protocol Buffers.
    Why ScalaPB?
    Taste of ScalaPB API.
    Using lenses for updating nested fields.
    Property-based testing.
  2. WHAT ARE PROTOCOL BUFFERS?
    Language that describes data structures.
    Defines encoding as bytes.
    Compiler generates code for Java, C++, Python and more.
  3. EXAMPLE 1: SYNTAX
        message Person {
          optional string first_name = 1;
          optional string last_name = 2;
          optional int32 age = 3;
        }

        $ protoc person.proto --java_out=.
  4. EXAMPLE 1: SERIALIZING
        val p: Person = Person.newBuilder()
          .setFirstName("John")
          .setLastName("Smith")
          .setAge(34)
          .build()
        p.toByteArray().toVector
    Output:
        Vector(10, 4, 74, 111, 104, 110, 18, 5, 83, 109, 105, 116, 104, 24, 34)
  5. EXAMPLE 1: PARSING
        val b = Array[Byte](10, 4, 74, 111, 104, 110, 18, 5, 83, 109, 105, 116, 104, 24, 34)
        println(Person.parseFrom(b))
    Output:
        first_name: "John"
        last_name: "Smith"
        age: 34
  6. NESTED AND REPEATED FIELDS
        message Address {
          optional string street = 1;
          optional string city = 2;
        }

        message BankAccount {
          optional string routing_number = 1;
          optional string account_number = 2;
        }

        message Person {
          optional string first_name = 1;
          optional string last_name = 2;
          optional int32 age = 3;
          optional Address address = 4;
          repeated BankAccount bank_accounts = 5;
        }
  7. BENEFITS AND USE CASES
    ...or why not JSON?
    Efficient binary format.
    Easier to programmatically access the data.
    Type-safe inter-process communication.
    Type-safe cross-team communication.
    Persistent storage format.
    Great for evolving schemas.
  8. SCHEMA UPDATES
    BEFORE
        message Person {
          optional string first_name = 1;
          optional string last_name = 2;
          optional int32 age = 3;
        }
    AFTER
        message Person {
          optional string given_name = 1;
          optional string surname = 2;
          // optional int32 age = 3;
          optional int32 year_of_birth = 4;
        }
  9. SCHEMA UPDATES
    Things you can do:
    Add optional fields.
    Remove optional fields.
    Rename fields.
    Convert between compatible types.
    Convert optional fields to repeated.
    Convert a repeated field to optional.
  10. SCHEMA UPDATES
    Things you can't do:
    Change a field's type (unless it's compatible).
    Remove a required field.
    Add a required field.
    Don't use required fields. They are going away anyway.
  11. JAVA PROTOBUFS
    Or why builders are annoying...
        val p: Person = Person.newBuilder()
          .setFirstName("John")
          .setLastName("Smith")
          .getAddressBuilder
            .setStreet("153 Maiden Lane")
            .setCity("San Francisco")
          .build()
  12. JAVA PROTOBUFS
    Or why builders are annoying...
        val p: Person = Person.newBuilder()
          .setFirstName("John")
          .setLastName("Smith")
          .getAddressBuilder  // <-- We get an address builder...
            .setStreet("153 Maiden Lane")
            .setCity("San Francisco")
          .build()            // <-- BOOM!!!

        type mismatch;
        [error]  found   : com.example.Test.Address
        [error]  required: com.example.Test.Person
        [error]     .build()
  13. JAVA PROTOBUFS
    Or why builders are annoying...
        val builder: Person.Builder = Person.newBuilder
          .setFirstName("John")
          .setLastName("Smith")
        builder.getAddressBuilder
          .setStreet("153 Maiden Lane")
          .setCity("San Francisco")
        val p: Person = builder.build()

        if (p.hasBankAccount) Some(p.getBankAccount) else None
  14. SCALAPB
    Compiles to case classes.
    Lenses for easy updates.
    Converters to and from Java.
    Written as a protoc plugin.
  15. SCALAPB: CASE CLASSES
        message Person {
          optional string first_name = 1;
          optional string last_name = 2;
          optional int32 age = 3;
          optional Address address = 4;
          repeated BankAccount bank_accounts = 5;
        }

        case class Person(
            firstName: Option[String] = None,
            lastName: Option[String] = None,
            age: Option[Int] = None,
            address: Option[Address] = None,
            bankAccounts: Seq[BankAccount] = Nil) {
          def toByteArray: Array[Byte]
          ...
        }

        object Person {
          def parseFrom(b: Array[Byte]): Person
        }
  16. SCALAPB: CASE CLASSES
        val p = Person(
          firstName = Some("John"),
          lastName = Some("Smith"),
          address = Some(Address(
            street = Some("153 Maiden Lane"),
            city = Some("San Francisco")
          ))
        )
    Better than builders, but lots of Some()
  17. SCALAPB: CASE CLASSES
    Nested updates are verbose:
        val p2 = p.copy(
          address = Some(p.address.get.copy(
            street = Some("Other Street"))))
    Fixed by lenses:
        val p2 = p.update(_.address.street := "Other Street")
  18. NESTED UPDATES
        case class A(b: B)
        case class B(c: C)
        case class C(d: D)
        case class D(e: Int)

        val a = A(B(C(D(e = 42))))
    Immutable update:
        val a2 = a.copy(b = a.b.copy(c = a.b.c.copy(d = a.b.c.d.copy(e = 17))))
    Mutable update:
        a.b.c.d.e = 17
  19. LENSES: INTRO
        trait Lens[Container, A] {
          def get(c: Container): A
          def set(a: A)(c: Container): Container
          // := returns a function that applies the update to a container.
          def :=(a: A): Container => Container = set(a)
        }

        object PersonAddressLens extends Lens[Person, Address] {
          def get(c: Person) = c.address
          def set(a: Address)(c: Person) = c.copy(address = a)
        }

        val p1 = PersonAddressLens.set(newAddress)(p)
        // or, equivalently:
        val p1 = (PersonAddressLens := newAddress)(p)
  20. LENSES COMPOSE
        PersonAddressLens: Lens[Person, Address]
        AddressStreetLens: Lens[Address, String]
    Compose them to obtain:
        PersonStreetLens: Lens[Person, String]
    This makes it possible to update the street through a person:
        val p2: Person = (PersonStreetLens := "Townsend")(p)
        val p2: Person = (PersonLens.street := "Townsend")(p)
  21. SCALAPB LENSES
    Generate lenses that enable:
        val p = Person().update(
          _.firstName := "John",
          _.optionalLastName := Some("Smith"),
          _.address.street := "Townsend",
          _.address.city.modify(_.toUpperCase),
          _.bankAccounts :++= seqOfAccounts,
          _.bankAccounts.foreach(_.accountNumber := "****")
        )
  22. TESTING
    Protocol Buffers are complex:
    15 primitive types.
    Enums.
    Repeated, required and optional fields.
    Packed repeated fields.
    Nested messages.
    Default values.
    Imports.
    One-ofs (union fields).
    Maps (in proto3).
    Two language versions.
  23. TESTING
    Use ScalaCheck to generate huge random protocol buffers.
    Compare with Java as reference implementation.
        import "fzkp.proto";

        message j17 {
          enum K11z {
            o = 1;
            zbb = 2;
          }
          message Usszpx {
            optional sint32 falv = 19;
            optional Usszpx yqevl = 11;
            ...
          }
          optional Usszpx uvf = 31;
          ...
        }
  24. BONUS SLIDE: WRITING A PROTOC PLUGIN
    protoc provides the plugin a representation of a parsed proto file
        name: "Person"
        field {
          name: "first_name"
          number: 1
          label: LABEL_OPTIONAL
          type: TYPE_STRING
        }
        field {
          name: "last_name"
          number: 2
          label: LABEL_OPTIONAL
          type: TYPE_STRING
        }
  25. BONUS SLIDE: UPDATE() SIGNATURE
        case class Person(...) {
          def update(ms: (PersonLens => (Person => Person))*): Person
        }
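
The lens slides above can be tied together in a small self-contained sketch. It is not ScalaPB's generated code: the `Lens` case class, the `andThen` combinator, the simplified `Person` and `Address` (with plain fields instead of `Option`), and the `LensSketch` object are all hypothetical names chosen for illustration. The `update` method here also takes plain `Person => Person` functions rather than the `PersonLens => (Person => Person)` shape shown on the last slide.

```scala
// A simplified lens, loosely modelled on the trait shown in the slides.
// Illustrative only: this is not the code ScalaPB generates.
final case class Lens[C, A](get: C => A, set: (C, A) => C) {
  // Returns a function that replaces the focused value.
  def :=(a: A): C => C = c => set(c, a)

  // Returns a function that transforms the focused value.
  def modify(f: A => A): C => C = c => set(c, f(get(c)))

  // Composes this lens with a lens into the focused value.
  def andThen[B](inner: Lens[A, B]): Lens[C, B] =
    Lens[C, B](
      c => inner.get(get(c)),
      (c, b) => set(c, inner.set(get(c), b))
    )
}

// Simplified messages (assumption: required-style fields, no Option).
final case class Address(street: String, city: String)
final case class Person(name: String, address: Address) {
  // Applies each mutation in order, starting from this instance.
  def update(ms: (Person => Person)*): Person =
    ms.foldLeft(this)((p, m) => m(p))
}

object LensSketch {
  val personAddress: Lens[Person, Address] =
    Lens[Person, Address](_.address, (p, a) => p.copy(address = a))
  val addressStreet: Lens[Address, String] =
    Lens[Address, String](_.street, (a, s) => a.copy(street = s))
  val addressCity: Lens[Address, String] =
    Lens[Address, String](_.city, (a, s) => a.copy(city = s))

  // Composed lenses reach through Person into Address.
  val personStreet: Lens[Person, String] = personAddress.andThen(addressStreet)
  val personCity: Lens[Person, String] = personAddress.andThen(addressCity)

  val p: Person = Person("John", Address("153 Maiden Lane", "San Francisco"))

  // Two nested updates without any hand-written copy() chains.
  val p2: Person = p.update(
    personStreet := "Townsend",
    personCity.modify(_.toUpperCase)
  )

  def main(args: Array[String]): Unit = println(p2)
}
```

Because every mutation is just a function from `Person` to `Person`, `update` can fold over any number of them, and the original value `p` is left unchanged.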