Beyond JSON: Fantastic Serialization Formats and Where to Find Them

Yos Riady
January 13, 2017

Today, JSON (JavaScript Object Notation) is the de facto serialization format for exchanging data between HTTP-connected services. Several features make JSON a useful general-purpose format: it's human readable, it's easy to learn, and JavaScript is ubiquitous. In this talk, let's look beyond JSON. We'll learn about three serialization formats (JSON, MessagePack, and Protocol Buffers) and discover the benefits unique to each.

https://goo.gl/f6ncAQ

Transcript

  1. Beyond JSON
    Fantastic Serialization Formats and Where To Find Them
    Yos Riady
    yos.io
    goo.gl/f6ncAQ

  2. Beyond JSON
    Fantastic Cerealization Formats and Where To Find Them
    Yos Riady
    yos.io
    goo.gl/f6ncAQ

  3. What’s a Web API?

  4. A Web API is a website for your program
    A Web API lets programs communicate with each other over
    the network.
    Serialization is a key step in this communication process.
    What medium do systems communicate through?

  5. Examples of Web APIs

  6. Agenda
    ● Background: Introduction to serialization
    ● JSON: The de-facto serialization format of today
    ● MessagePack: An efficient binary serialization format
    ● Protocol Buffers: For serializing structured data
    ● Conclusion: Next Steps

  7. Serialization, what’s that?

  8. Introduction to Serialization (and Deserialization)
    Serialization is the process of translating object state into a format that can be
    transmitted and reconstructed later.
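
As a minimal sketch of that definition, the round trip below serializes an in-memory object to bytes and reconstructs it, using Python's standard `json` module. The `person` record and its field names are illustrative assumptions, not part of the talk.

```python
import json

# Hypothetical object state held in memory by the sending program.
person = {
    "first_name": "George",
    "last_name": "Washington",
    "birthday": "1732-02-22",
}

# Serialize: translate object state into bytes that can be transmitted.
wire_bytes = json.dumps(person).encode("utf-8")

# Deserialize: reconstruct an equivalent object on the receiving side.
restored = json.loads(wire_bytes.decode("utf-8"))

print(restored == person)
```

The receiver ends up with a new object that is equal to the original but not the same object, which is what "reconstructed later" means here.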

  10. For APIs, communication is key.

  11. Reasons for Serialization
    ● Communication: For transferring data between systems
    ○ Systems need a shared language to exchange information
    ○ The language has to be platform independent

  12. Data serialization format

  13. Challenges of Serialization
    ● Human readability
    ● Types and validation
    ● Schema evolution
    ● Interface Definition / Documentation
    ● Performance
    ● Others

  14. Agenda
    ● Background: Introduction to serialization
    ● JSON: The de-facto serialization format of today
    ● MessagePack: An efficient binary serialization format
    ● Protocol Buffers: For serializing structured data
    ● Conclusion: Next Steps

  15. JSON (JavaScript Object Notation)
    ● The de facto standard for data serialization on the web
    ○ Easy to parse, generate, and read
    ○ Human readable
    ○ No schema
    ○ No type checking
    ● Easy to work with, but not very efficient over the wire
    ● No built-in schema support

  16. JSON: Human readable
    {
    "first_name": "George",
    "last_name": "Washington",
    "birthday": "1732-02-22",
    "address": {
    "street_address": "3200 Mount Vernon Memorial Highway",
    "city": "Mount Vernon",
    "state": "Virginia",
    "country": "United States"
    }
    }

  17. No types
    ● Type information from statically typed languages is ‘lost in translation’
    ● Validating messages requires ad-hoc validation code, which has to be written by hand
    ○ Checking if a required attribute exists
    ○ Checking the type of an attribute
    ○ Other validations
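
The ad-hoc validation described above might look like the following sketch. The function name `parse_person` and the validation rules are hypothetical, chosen to match the George Washington example from the previous slide; a real service would have its own rules.

```python
import json
from datetime import date


def parse_person(raw: str) -> dict:
    """Decode a JSON person and validate it by hand (illustrative rules)."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    # Check that each required attribute exists and has the expected type.
    for field in ("first_name", "last_name", "birthday"):
        if field not in data:
            raise ValueError(f"missing required field: {field}")
        if not isinstance(data[field], str):
            raise ValueError(f"{field} must be a string")

    # Other validations: JSON has no date type, so the string is parsed here.
    data["birthday"] = date.fromisoformat(data["birthday"])
    return data


person = parse_person(
    '{"first_name": "George", "last_name": "Washington", "birthday": "1732-02-22"}'
)
print(person["birthday"].year)
```

None of these checks come from JSON itself, so every consumer of the message has to repeat them or share the code.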

  18. Agenda
    ● Background: Introduction to serialization
    ● JSON: The de-facto serialization format of today
    ● MessagePack: An efficient binary serialization format
    ● Protocol Buffers: For serializing structured data
    ● Conclusion: Next Steps

  19. MessagePack
    ● Like JSON, but with efficient binary encoding
    ○ Not human readable
    ○ Smaller: Takes less space
    ○ Faster: Cut your client-server exchange traffic
    ○ Schemas & Types (IDL)
    ● Useful for systems that require low latency and high throughput.
    ○ Realtime games & systems / APIs
    ● Can be used alongside JSON

  20. MessagePack: More compact than JSON
    JSON: 27 bytes (without whitespace)
    {"compact":true,"schema":0}
    MessagePack: 18 bytes
    82 a7 63 6f 6d 70 61 63 74 c3 a6 73 63 68 65 6d 61 00
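
To see where the 18 bytes come from, here is a hand-rolled sketch that encodes only the small subset of MessagePack this example needs (booleans, integers 0 to 127, strings up to 31 bytes, maps up to 15 entries). The `pack` function is a hypothetical helper written for illustration; it is not the API of any MessagePack library.

```python
import json


def pack(value) -> bytes:
    """Encode a tiny subset of MessagePack (illustrative, not a full encoder)."""
    if value is True:
        return b"\xc3"                             # true
    if value is False:
        return b"\xc2"                             # false
    if isinstance(value, int) and 0 <= value <= 0x7F:
        return bytes([value])                      # positive fixint
    if isinstance(value, str):
        data = value.encode("utf-8")
        if len(data) > 31:
            raise ValueError("this sketch only supports fixstr (<= 31 bytes)")
        return bytes([0xA0 | len(data)]) + data    # fixstr: 101xxxxx + bytes
    if isinstance(value, dict):
        if len(value) > 15:
            raise ValueError("this sketch only supports fixmap (<= 15 entries)")
        out = bytes([0x80 | len(value)])           # fixmap: 1000xxxx
        for key, item in value.items():
            out += pack(key) + pack(item)
        return out
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


message = {"compact": True, "schema": 0}
packed = pack(message)
compact_json = json.dumps(message, separators=(",", ":")).encode("utf-8")

print(len(compact_json), len(packed))              # 27 18
print(" ".join(f"{b:02x}" for b in packed))
```

The savings come from replacing quotes, colons, commas and braces with one-byte type-and-length prefixes; the key names themselves are still sent in full.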

  21. MessagePack Demo

  22. Agenda
    ● Background: Introduction to serialization
    ● JSON: The de-facto serialization format of today
    ● MessagePack: An efficient binary serialization format
    ● Protocol Buffers: For serializing structured data
    ● Conclusion: Next Steps

  23. Protocol Buffers
    ● A way of encoding structured data in an efficient yet extensible format.
    ○ “The language of data” at Google
    ○ Communication between internal services
    ● Compact binary format
    ● Schemas
    ● Client generation

  24. “We carefully craft our data models inside
    our databases, maintain layers of code to
    keep these models in check, and then allow
    all that forethought to fly out of the window
    when we want to send that data over the
    wire to another service.”

  25. Protobufs: Schemas are awesome
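    // Generated Java client code
    Person john = Person.newBuilder()
        .setId(1234)
        .setName("John Doe")
        .setEmail("[email protected]")
        .build();
    // Write the serialized message to the file named by the first argument
    FileOutputStream output = new FileOutputStream(args[0]);
    john.writeTo(output);
    output.close();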

  26. message Person {
      required string name = 1;
      required int32 id = 2;
      optional string email = 3;
      enum PhoneType {
        MOBILE = 0;
        HOME = 1;
        WORK = 2;
      }
      message PhoneNumber {
        required string number = 1;
        optional PhoneType type = 2 [default = HOME];
      }
      repeated PhoneNumber phone = 4;
    }
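
The numbers after each `=` in the schema above are field numbers, and they are what travels on the wire instead of field names. The sketch below hand-encodes `name` and `id` following the Protocol Buffers wire format (varints plus length-delimited strings). The helper names are hypothetical and this is not the generated-code API; real code would use the classes produced by `protoc`.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if n < 0:
        raise ValueError("negative values are not handled in this sketch")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A field's key is (field_number << 3) | wire_type, itself a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_string(field_number: int, text: str) -> bytes:
    data = text.encode("utf-8")
    # Wire type 2: length-delimited (length prefix, then the bytes).
    return encode_tag(field_number, 2) + encode_varint(len(data)) + data


def encode_int32(field_number: int, value: int) -> bytes:
    # Wire type 0: varint.
    return encode_tag(field_number, 0) + encode_varint(value)


# name = 1 and id = 2, as declared in the Person schema above.
person_bytes = encode_string(1, "John Doe") + encode_int32(2, 1234)

print(len(person_bytes))
print(" ".join(f"{b:02x}" for b in person_bytes))
```

Because only the field number and type are sent, the receiver needs the same schema to know that field 1 is called `name`.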

  27. Protobufs: Schema evolution
    We only know something once we start doing it.
    Can we add new fields to our schema over time, without breaking
    backwards-compatibility?

  28. Protobufs: Backward compatibility
    // Before
    message Person {
      required int32 id = 1;
      required string name = 2;
      optional string email = 3;
    }
    // After: email removed, age added
    message Person {
      required int32 id = 1;
      required string name = 2;
      optional int32 age = 4;
    }
    ● Old code will happily read new messages and simply ignore any new fields
    ● To the old code, optional fields that were deleted will simply have their default value
    ● New code will also transparently read old messages
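
The three rules above can be sketched with a toy decoder that follows the Protocol Buffers wire format for varints and strings only. Everything here (`field`, `read_person`, the schema dictionaries, the sample values) is a hypothetical illustration, not the generated-code API; the point is only that a reader keyed on field numbers can skip numbers it does not know and fall back to defaults for ones that are absent.

```python
def encode_varint(n: int) -> bytes:
    if n < 0:
        raise ValueError("negative values are not handled in this sketch")
    out = bytearray()
    while True:
        byte, n = n & 0x7F, n >> 7
        out.append(byte | 0x80 if n else byte)
        if not n:
            return bytes(out)


def decode_varint(buf: bytes, pos: int):
    """Return (value, next_position) for the varint starting at pos."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def field(number: int, value) -> bytes:
    """Encode one field: int as varint (type 0), str as length-delimited (type 2)."""
    if isinstance(value, int):
        return encode_varint(number << 3) + encode_varint(value)
    data = value.encode("utf-8")
    return encode_varint((number << 3) | 2) + encode_varint(len(data)) + data


def read_person(buf: bytes, schema: dict) -> dict:
    """Decode with schema {field_number: (name, default)}, skipping unknown fields."""
    person = {name: default for name, default in schema.values()}
    pos = 0
    while pos < len(buf):
        tag, pos = decode_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:
            length, pos = decode_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError("wire type not handled in this sketch")
        if number in schema:  # unknown field numbers are read past and ignored
            name = schema[number][0]
            person[name] = value.decode("utf-8") if isinstance(value, bytes) else value
    return person


OLD_SCHEMA = {1: ("id", 0), 2: ("name", ""), 3: ("email", "")}
NEW_SCHEMA = {1: ("id", 0), 2: ("name", ""), 4: ("age", 0)}

new_message = field(1, 1234) + field(2, "John Doe") + field(4, 42)
old_message = field(1, 1234) + field(2, "John Doe") + field(3, "jdoe@example.com")

print(read_person(new_message, OLD_SCHEMA))  # age is ignored, email is the default
print(read_person(old_message, NEW_SCHEMA))  # email is ignored, age is the default
```

This only works because field numbers are never reused: `age` takes number 4 rather than recycling 3, so an old reader cannot mistake it for `email`.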

  29. Agenda
    ● Background: Introduction to serialization
    ● JSON: The de-facto serialization format of today
    ● MessagePack: An efficient binary serialization format
    ● Protocol Buffers: For serializing structured data
    ● Conclusion: Next Steps

  30. In Closing
    ● When is JSON a good fit?
    ○ You want data to be human readable
    ○ Data is consumed directly in the browser
    ○ It’s not important to tie the data model to a schema
    ● MessagePack
    ○ When low latency and high throughput are key
    ○ Internal communication
    ● Protocol Buffers
    ○ Serializing structured data with Schemas & Types
    ○ Client generation across languages
    ○ Backward compatibility & Schema evolution
    ○ Internal communication

  31. Thanks
    Yos Riady
    yos.io

  32. Questions
    Yos Riady
    yos.io
