Profiles References HTTP and Me or, How I Learned to HATEOAS and Love to REST Christopher Harrison [email protected] Human Genetics Informatics · Wellcome Trust Sanger Institute Wednesday 28th January, 2015
The Ghost of Web APIs Past
• HTTP was designed as the application layer for hypermedia information systems.
• It is not restricted to hypermedia formats such as HTML (it also carries binary data, such as images), but hypermedia formats are nonetheless meant to provide the primary interface.
• Loose constraints and proliferation eroded this notion, and arbitrary data was tacked on to the back of HTTP for RPC.
SOAP and WSDL
• SOAP is a protocol for exchanging structured data, in the form of an XML blob, usually (but not necessarily) over HTTP.
• End points are used like an I/O pipe, so the semantics of the data have to be encoded in the XML blob for delegation to the server.
• Those semantics are completely arbitrary, so an external user of the service can’t know how to write their XML without explicit documentation from the service provider.
• WSDL (Web Services Description Language) alleviates this problem, somewhat, by providing a structured document that can be used to generate valid clients.
So we have to generate the WSDL from the service implementation, so we can generate a client that can generate legal XML which the service can understand!?
The Ghost of Web APIs Present
• Single HTTP end points are not meant to be used as global RPC sinks.
  Solution: The R in URL stands for “resource”: we can separate application end points and group them by purpose.
• Application state should be transparent to clients.
  Solution: The semantics of generic HTTP methods, combined with a more productive use of end points.
• Nobody really likes XML!
  Solution: While XML is still used, it is not restricted to heavyweight SOAP packets. That said, JSON use has surged in popularity.
The Problems
• While more distributed, current RESTful APIs are still mostly just RPC interfaces. The S in REST is about state, not calling procedures.
• Client writers still need explicit documentation to navigate the semantics of an API’s routes. There’s no WSDL equivalent, but the facility has been available since the beginning: the H in HTTP.
• Despite JSON’s popularity, it is not a hypermedia format, and attempts to solve this have yet to fully crystallise.
HTTP Overview
• Application layer, request-response protocol following the client-server model. Each request is self-contained, which makes it a “stateless” protocol (cf. SSH).
• Resources are identified and located via URIs (specifically, URLs), which form a tree-like hierarchy. This addressing mechanism facilitates arbitrary links between resources via hypertext documents (hence “web”).
• A limited set of desired actions (called “request methods” or “verbs”) can be performed against any particular resource. What this (method, resource) tuple represents, if anything, is implementation specific, but semantically contingent upon the verb.
Request Methods
• The HTTP/1.1 specification defines eight methods. Additional methods can be created and are often standardised (e.g., PATCH). The point is that methods are well-known and their semantics agreed upon.
• Semantics are defined in terms of potency:
  Nullipotent: Returns a resource without affecting it (also known as a “safe” method).
  Idempotent: An action which, when performed multiple times against a resource, has no further effect than had it been done just once. (Note that nullipotency implies idempotency.)
HTTP Method Typology for Web Services
Method    Nullipotent  Idempotent  Purpose
GET       yes          yes         Retrieve the resource.
HEAD      yes          yes         Same as GET, but only returns the response header for the resource.
POST      no           no          Create subordinate data under the resource address.
PUT       no           yes         Upsert (create or update) data at the address.
DELETE    no           yes         Delete the resource.
OPTIONS   yes          yes         Return the HTTP methods supported by the resource.
PATCH     no           no          Apply partial modification to a resource.
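The potency classes can be checked against a toy model. The following is a minimal sketch, assuming a hypothetical in-memory `Store` class standing in for a server; the class, its method names and the `/samples` paths are illustrative and not part of any real framework.

```python
class Store:
    """Hypothetical in-memory stand-in for a server's resources."""

    def __init__(self):
        self.resources = {}
        self.next_id = 1

    def get(self, path):
        # Nullipotent: reads state, never changes it.
        return self.resources.get(path)

    def put(self, path, value):
        # Idempotent: repeating the call leaves the same final state.
        self.resources[path] = value

    def delete(self, path):
        # Idempotent: deleting an absent resource changes nothing further.
        self.resources.pop(path, None)

    def post(self, collection, value):
        # Neither: every call creates a new subordinate resource.
        path = f"{collection}/{self.next_id}"
        self.next_id += 1
        self.resources[path] = value
        return path


def snapshot(store):
    """Copy the current state so it can be compared later."""
    return dict(store.resources)


store = Store()
store.put("/samples/a", "v1")
after_one_put = snapshot(store)
store.put("/samples/a", "v1")
put_is_idempotent = snapshot(store) == after_one_put

store.get("/samples/a")
get_is_nullipotent = snapshot(store) == after_one_put

first = store.post("/samples", "v2")
after_one_post = snapshot(store)
second = store.post("/samples", "v2")
post_is_idempotent = snapshot(store) == after_one_post

print(put_is_idempotent, get_is_nullipotent, post_is_idempotent)
```

Repeating the PUT leaves the state unchanged, whereas repeating the POST creates a second resource, which is why a client can safely retry the former but not the latter.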
REST Overview
• REST is an architectural pattern, running off the back of the HTTP (or HTTP-like) layer, with a number of constraints (known as “Fielding constraints”).
• In principle, resources are addressed by URLs, their state is exchanged as representations, and state changes are mapped to HTTP methods. This gives us a uniform interface, unlike the service-specific interface of SOA-style applications like SOAP.
• Representations can make use of hypermedia to imply application semantics to a client. This is known as the Hypermedia as the Engine of Application State (HATEOAS) constraint. That is, there should be no need for API end points to be documented in the traditional sense; instead, the API becomes “discoverable”.
The World Wide Web
• The World Wide Web, driven by a human agent, is the best advertisement for REST.
• An end user consumes representations and follows links as they require.
• The semantics of HTML (e.g., anchors and forms) allow them to do this efficiently and unambiguously.
RESTful APIs
• The goal of a real RESTful API is to mimic the web, albeit a domain-specific subset, in a transparent (i.e., machine readable) way.
• A client is simply provided with an entry point (cf. website), which it can consume by following links and interpreting whatever data is fed to it.
• That is, a RESTful API should be entirely self-describing and aware of application semantics.
The HATEOAS Constraint
• A client enters a REST API at a single, fixed URL.
• All future actions that can be made by the client are discovered from the resource representations returned by the server.
• The media types, and any “link relations” contained within them, are standardised so a client can understand them.
• A client transitions through application states by following links within a representation, or by manipulating that representation appropriately.
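The constraint can be illustrated with a client that knows only the entry point and a sequence of link relations. This is a minimal sketch, assuming a hypothetical API held in a dictionary so that it runs without a network; the URLs, the `links` key and the relation names are all invented for illustration.

```python
# Hypothetical API, held in memory so the sketch runs without a network.
# Each representation carries its own links, keyed by link relation.
API = {
    "/": {"links": {"samples": "/samples"}},
    "/samples": {"links": {"first": "/samples/1"}, "count": 2},
    "/samples/1": {"links": {"next": "/samples/2"}, "name": "alpha"},
    "/samples/2": {"links": {}, "name": "beta"},
}


def fetch(url):
    """Stand-in for an HTTP GET against the hypothetical API."""
    return API[url]


def follow(entry_point, relations):
    """Start at the entry point and follow each relation in turn.

    The client never builds a URL itself; it only uses URLs that the
    previous representation supplied.
    """
    representation = fetch(entry_point)
    for rel in relations:
        url = representation["links"][rel]
        representation = fetch(url)
    return representation


result = follow("/", ["samples", "first", "next"])
print(result["name"])
```

Because the client depends on relation names rather than URL structure, the server in this sketch could rename `/samples/2` without breaking it.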
Client Types
Human: Interactively browses an API, which is somehow translated from its machine readable representation into a human readable equivalent.
Crawler: Indexes an API in its entirety without user intervention, by just following links, to “map” the API’s functionality (e.g., a search engine spider).
Monitor: Periodically checks the representation of a single resource, without following links (e.g., an RSS aggregator).
Script: Follows a specific path through an API to achieve a certain result; everything is prescribed.
Agent: Provided the API’s metadata is rich enough, can make meaningful decisions that ultimately invoke the required state changes for its purpose.
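Of these, the crawler is the simplest to sketch. The following assumes the same kind of hypothetical in-memory API as a stand-in for real HTTP requests; the URLs, the `links` key and the relation names are illustrative only.

```python
from collections import deque

# Hypothetical API with a cycle ("up" links back to the collection).
API = {
    "/": {"links": {"samples": "/samples"}},
    "/samples": {"links": {"first": "/samples/1"}},
    "/samples/1": {"links": {"next": "/samples/2", "up": "/samples"}},
    "/samples/2": {"links": {"up": "/samples"}},
}


def fetch(url):
    """Stand-in for an HTTP GET against the hypothetical API."""
    return API[url]


def crawl(entry_point):
    """Breadth-first walk that records the relations each resource offers."""
    seen = set()
    queue = deque([entry_point])
    site_map = {}
    while queue:
        url = queue.popleft()
        if url in seen:
            continue  # already indexed; avoids looping on cycles
        seen.add(url)
        links = fetch(url).get("links", {})
        site_map[url] = sorted(links)
        queue.extend(links.values())
    return site_map


site_map = crawl("/")
for url, relations in site_map.items():
    print(url, relations)
```

A script, by contrast, would follow one fixed sequence of relations, and a monitor would call `fetch` on a single URL repeatedly.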
Link Relations
• A link relation is an attribute attached to a hyperlink that describes the type of link or the relationship between destination and source.
• They can be embedded into appropriate media types (e.g., HTML), or included within the HTTP response headers.
• The IANA maintain a registry of standard link relations, but others can be defined by URL.
• Link relations can be used to encode application semantics for an agent to consume.
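When link relations travel in the HTTP response headers, they use the Link header field. The following is a simplified sketch of reading one; it assumes that no commas or semicolons occur inside quoted parameter values, so it is not a complete parser for the header grammar. The header value and URLs are invented examples.

```python
import re

# A link target in angle brackets, followed by its ";"-separated parameters.
LINK_PATTERN = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')


def parse_link_header(value):
    """Map each link relation to its target URL.

    Simplified: assumes no commas or semicolons inside quoted values.
    """
    links = {}
    for target, params in LINK_PATTERN.findall(value):
        for param in params.split(";"):
            name, _, raw = param.strip().partition("=")
            if name.lower() == "rel":
                # A rel value may list several space-separated relations.
                for rel in raw.strip('"').split():
                    links[rel] = target
    return links


header = (
    '</samples?page=2>; rel="next", '
    '<http://example.org/profiles/sample>; rel="profile"'
)
links = parse_link_header(header)
print(links)
```

Here `next` is a registered relation, while an unregistered one would be written as a full URL in the `rel` value.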
Hypermedia Media Types
• “Media type” is a standard identifier, curated by the IANA, to indicate the format of a file’s data. Some, albeit few, natively support hyperlinking.
• JSON (application/json) is commonly used to communicate with an API, but it is not a hypermedia format. As such, derivatives have appeared that impose structure onto JSON to give data link and, occasionally, relation keys (e.g., HAL and Siren).
• These derivatives haven’t really taken off, largely through lack of exposure and the additional restrictions that they impose.
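The difference is visible when the same record is written as plain JSON and as HAL. This sketch assumes HAL's reserved `_links` member, whose keys are link relations and whose values are link objects with an `href`; the sample data and URLs are invented.

```python
import json

# Plain JSON: "study" is just a number; nothing marks it as a link.
plain = json.loads('{"id": 1, "name": "alpha", "study": 7}')

# HAL: the reserved "_links" member maps link relations to link objects.
hal = json.loads("""
{
  "id": 1,
  "name": "alpha",
  "_links": {
    "self": {"href": "/samples/1"},
    "study": {"href": "/studies/7"}
  }
}
""")


def hal_links(document):
    """Collect the targets of every link relation in a HAL document."""
    links = {}
    for rel, link in document.get("_links", {}).items():
        # A relation may hold one link object or a list of them.
        targets = link if isinstance(link, list) else [link]
        links[rel] = [target["href"] for target in targets]
    return links


print(hal_links(plain))
print(hal_links(hal))
```

A generic client can navigate the HAL document without knowing the domain, whereas the plain version needs out-of-band documentation to say that `study` identifies another resource.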
Why Not Just Use HTML?
• Why not, indeed! This is the track taken by “microformats”, where application semantics are embedded into HTML elements using, for example, class attributes.
• In general, however, HTML is designed for human consumption, so it comes with a lot of redundant functionality, without the specificity that a machine requires.
• The best compromise would be some XML schema, with a DTD, that uses XLink to facilitate hypermedia. If it weren’t for the “Nobody really likes XML!” problem, this would almost be ideal...
Non-Proliferation of Media Types
• Profiles use a mechanism similar to microformats to embed application semantics into arbitrarily structured data; i.e., by assigning meaning to specific keys. For example, ALPS documents can apply to XML (tags) and JSON (object members) alike.
• Profile documents can be human or machine readable (or both) and should be linked to the resource in question, either by using any support provided by the resource’s media type or as part of the Link HTTP response header field.
• If a profile document is also a hypermedia document, we can construct a rich and orthogonal dependency graph for an application’s semantics.
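The idea of one profile serving several media types can be sketched as follows. The `PROFILE` dictionary is a hypothetical, deliberately simplified stand-in that maps keys to meanings; it does not use real ALPS syntax, and the sample documents are invented.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical profile: assigns a meaning to each key. Not real ALPS syntax.
PROFILE = {
    "name": "Human-readable label of the sample",
    "donor": "Link to the donor the sample was taken from",
}


def annotate_json(text, profile):
    """Look up the meaning of each top-level JSON object member."""
    data = json.loads(text)
    return {key: profile[key] for key in data if key in profile}


def annotate_xml(text, profile):
    """Look up the meaning of each child element, keyed by tag name."""
    root = ET.fromstring(text)
    return {child.tag: profile[child.tag] for child in root if child.tag in profile}


json_doc = '{"name": "alpha", "donor": "/donors/3", "size": 12}'
xml_doc = (
    "<sample><name>alpha</name><donor>/donors/3</donor>"
    "<size>12</size></sample>"
)

json_meaning = annotate_json(json_doc, PROFILE)
xml_meaning = annotate_xml(xml_doc, PROFILE)
print(json_meaning == xml_meaning)
```

Both representations resolve to the same semantics, and the `size` key, which the profile does not describe, is simply left uninterpreted.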
What About RDF?
• The Resource Description Framework is a family of specifications to model/conceptualise the semantics of resources.
• It uses a variety of serialisation formats, including a JSON derivative: JSON-LD (JSON for Linked Data), which can carry its contextual information inline or reference it via the HTTP Link header.
• One can think of it as a kind of “type annotation” system for plain JSON.
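The “type annotation” view can be made concrete with a small example. This sketch assumes a flat, inline `@context` whose values are plain IRI strings, and it only mimics a small part of what a real JSON-LD processor does; the `expand_keys` helper and the sample document are illustrative.

```python
document = {
    "@context": {
        "name": "http://schema.org/name",
        "homepage": "http://schema.org/url",
    },
    "name": "Example Sample",
    "homepage": "http://example.org/",
    "size": 12,
}


def expand_keys(doc):
    """Replace each key with the IRI that the inline context assigns to it.

    Simplified: handles only flat contexts with string values. Keys with
    no mapping are dropped by this sketch.
    """
    context = doc.get("@context", {})
    return {
        context[key]: value
        for key, value in doc.items()
        if key in context
    }


expanded = expand_keys(document)
for iri, value in expanded.items():
    print(iri, "=", value)
```

The plain keys stay convenient for ordinary JSON consumers, while the context tells a linked-data client which globally identified property each key denotes.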
Can We Go Further?
• Don’t simple profiles just reinvent DTDs, but as a “woolly”, generalised version that replaces structural constraints with a vague notion of semantics?
• By imposing structural constraints, we pigeonhole ourselves by media type (e.g., XML is richer than JSON). However, can we create structure using algebraic typing of primitives?
• Type annotation, as with JSON-LD, is a good start, but can we use an analogue to strong typing and inference in our profile graphs to construct rich schemata that also encode application semantics?
Further Reading
• Application-Level Profile Semantics (ALPS). Amundsen, Richardson and Foster (2014–15); IETF, Internet-Draft
• JSON-LD 1.0. Sporny et al. (2014); W3C Recommendation
• RESTful Web APIs. Richardson and Amundsen (2013); O’Reilly
• The ‘profile’ Link Relation Type. Wilde (2013); IETF, RFC 6906
• Web Linking. Nottingham (2010); IETF, RFC 5988
• Architectural Styles and the Design of Network-Based Software Architectures. Fielding (2000); University of California, Irvine
• Hypertext Transfer Protocol – HTTP/1.1. Fielding et al. (1999); IETF, RFC 2616