
Sequential Operations With REST


With REST, each operation is bound to a resource: typically a combination of an IRI, a verb, and possibly headers and a payload.

What if you need to create two different resources, guarantee their creation order, and decide whether or not to wait for each request to be executed?

Well, let me introduce you to a method to handle these actions via API Platform.

Grégoire Hébert

December 15, 2022

Transcript

  1. Summary 01 What is the subject, and why? 02 Research and

    implementation 03 Usages 04 Alternatives 05 What's next?
  2. @gregoirehebert @gheb_dev When might we encounter this situation ? ➔

    Submission of forms in several passes 1 What is the subject and why? From a performance or economic standpoint, how can we optimise the sending of requests? And, if we are in fintech, where finance-related messages are sent, how do we ensure that they are executed in succession, or in groups? Maybe invalidate the first two if the third doesn't succeed?
  3. @gregoirehebert @gheb_dev 1 What is the subject and why? When

    might we encounter this situation ? ➔ Submission of forms in several passes ➔ Contact multiple endpoint at the same time
  4. @gregoirehebert @gheb_dev When might we encounter this situation ? ➔

    Submission of forms in several passes ➔ Contact multiple endpoints at the same time ➔ Exploit a resource right after its creation, without waiting for the result to be sent. 1 What is the subject and why?
  5. @gregoirehebert @gheb_dev What does Respecting REST entail? ➔ Using HTTP

    ➔ Being stateless 1 What is the subject and why? HTTP is stateless Meaning that there is no link between two requests executed successively on the same connection. This can be problematic for users trying to interact with certain pages in a consistent way, for example using shopping carts for e-commerce. Therefore while the HTTP protocol itself is stateless, HTTP cookies allow sessions to share the same context, or state.
  6. @gregoirehebert @gheb_dev Resolutions thanks to the evolutions of the HTTP

    protocol. ➔ With HTTP/1.1 ◆ using pipelining ➔ With HTTP/2 ◆ using multiplexing 1 What is the subject and why?
  7. @gregoirehebert @gheb_dev 1 What is the subject and why? In

    HTTP/1.1, we open a connection, and then the requests stack up and run one after the other. With pipelining, requests are sent back-to-back without waiting for each response, and the responses arrive in the same order. With multiplexing, the first to respond wins; the order of execution is not guaranteed. For multiplexing to work, the protocol associates each request with an identifier to correctly form the request/response tuple, and submits it as a stream.
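To illustrate the pipelining case (not from the deck; host and paths are invented): the client writes several requests on one connection before reading any response.

```http
GET /greetings/1 HTTP/1.1
Host: example.com

GET /greetings/2 HTTP/1.1
Host: example.com
```

The server must answer /greetings/1 first, then /greetings/2, even if the second is ready earlier. HTTP/2 multiplexing removes that ordering constraint by tagging each frame with a stream identifier.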
  8. @gregoirehebert @gheb_dev What are the limitations? ➔ Using HTTP/1.1 ◆

    with pipelining ➔ Using HTTP/2 ◆ with multiplexing 1 What is the subject and why?
  9. What are the limitations? ➔ Using HTTP/1.1 ◆ with pipelining

    • HOL, Hop by Hop, disabled by default • idempotent only ➔ Using HTTP/2 ◆ with multiplexing • Order is not guaranteed @gregoirehebert @gheb_dev 1 What is the subject and why? Pipelining requests leads to improved load times, but a limitation of HTTP/1.1 still applies: the server must send its responses in the same order as the requests were received, so the entire connection remains FIFO and Head-Of-Line blocking can occur. Example: if a client sends 4 pipelined GET requests to a proxy over a single connection and the first one is not in its cache, the proxy must forward this request to the destination web server; if the next three requests are instead found in its cache, the proxy must wait for the web server's response, then send it to the client, and only then can it also send the three cached responses. Furthermore, POST requests cannot be pipelined; only idempotent verbs can be. And finally, HTTP connection handling is hop-by-hop, not end-to-end: it is transmitted intermediary by intermediary, and if a single one does not support or activate pipelining, the benefit is lost.
  10. @gregoirehebert @gheb_dev What if it cannot be in HTTP/2? 1

    What is the subject and why? What if I don't have HTTP/2? Then HTTP/1.1
  11. @gregoirehebert @gheb_dev Another alternative ➔ Domain Sharding : ww1.example.com (prefer

    HTTP/2) ➔ Batch Operations. 1 What is the subject and why? When limited to HTTP/1.1 and its cap on established connections, you can always create multiple domains to be able to contact the same server multiple times. But, of course, HTTP/2 and 3 are a better choice here. Another solution, which pretty much bypasses the notion of statelessness via the protocol itself, is to embed several requests in a single one.
  12. @gregoirehebert @gheb_dev 1 2 Research and implementations https://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html How can

    we manipulate this to our advantage? Here is what an HTTP request is, taken from https://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html : A request message from a client to a server includes, in the first line of the message, the method to be applied to the resource, the identifier of the resource, and the protocol version in use. The goal is to send this information several times in the same request. First reflex: if I had the idea, others must have had it too (before).
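For reference, a minimal request message in that RFC 2616 shape (host, resource path, and payload are invented for illustration):

```http
POST /greetings HTTP/1.1
Host: example.com
Content-Type: application/json
Content-Length: 25

{"message": "Hello REST"}
```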
  13. @gregoirehebert @gheb_dev ➔ https://datatracker.ietf.org/doc/html/draft-snell-http-batch-01 ➔ https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html ➔ https://www.odata.org/documentation/#stq=batch&stp=1 ◆ http://docs.oasis-open.org/odata/odata/v4.01/odata-v4.01-part1-protocol.html#sec_BatchRequests

    ◆ http://docs.oasis-open.org/odata/odata/v4.01/odata-v4.01-part1-protocol.html#sec_Preferencecontinueonerrorodatacontin ◆ http://docs.oasis-open.org/odata/odata/v4.01/odata-v4.01-part1-protocol.html#sec_Preferencerespondasync ➔ https://developers.google.com/gmail/api/guides/batch ➔ https://developers.google.com/people/v1/batch ➔ https://cloud.google.com/storage/docs/batch ➔ https://www.doctrine-project.org/projects/doctrine-dbal/en/latest/reference/transactions.html#transaction-nesting 1 2 Research and implementations
  14. @gregoirehebert @gheb_dev ➔ https://datatracker.ietf.org/doc/html/draft-snell-http-batch-01 ➔ Send everything in one endpoint

    1 2 Research and implementations The IETF is usually a good place to start, and James M. Snell (an IBM engineer who also worked on PATCH, the Prefer header, and JSON Merge Patch) proposed a draft to handle batch requests.
  15. @gregoirehebert @gheb_dev ➔ https://datatracker.ietf.org/doc/html/draft-snell-http-batch-01 ➔ Send everything in one endpoint

    ➔ Uses the multipart/http Content-Type header. ➔ Each subquery is described in the payload, separated by a delimiting string. 1 2 Research and implementations
  16. @gregoirehebert @gheb_dev ➔ https://datatracker.ietf.org/doc/html/draft-snell-http-batch-01 ➔ Send everything in one endpoint

    ➔ Uses the multipart/http Content-Type header. ➔ Each subquery is described in the payload, separated by a delimiting string. ➔ Each subquery is identified by a Content-ID to associate the responses. 1 2 Research and implementations
  17. @gregoirehebert @gheb_dev ➔ https://datatracker.ietf.org/doc/html/draft-snell-http-batch-01 1 2 Research and implementations ➔

    Send everything in one endpoint ➔ Uses the multipart/http Content-Type header. ➔ Each subquery is described in the payload, separated by a delimiting string. ➔ Each subquery is identified by a Content-ID to associate the responses. ➔ We cannot assume the execution order of each subquery.
  18. @gregoirehebert @gheb_dev ➔ https://datatracker.ietf.org/doc/html/draft-snell-http-batch-01 ➔ Send everything in one endpoint

    ➔ Uses the multipart/http Content-Type header. ➔ Each subquery is described in the payload, separated by a delimiting string. ➔ Each subquery is identified by a Content-ID to associate the responses. ➔ We cannot assume the execution order of each subquery. ➔ The main query cannot be cached, but the subqueries can be. 1 2 Research and implementations In addition, each sub-query has its own authentication and authorisation handling. A batch query should not inherit the authentication/authorisation of the multipart query that contains it or the other individual queries. Just because a user is authorised to submit a batch query does not mean they are authorised to submit each of the individual queries. This is a really good start, but there is room for improvement. Let's dive into this Content-Type multipart.
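A hedged sketch of a batch request in the spirit of draft-snell-http-batch-01 — the endpoint path, boundary string, Content-IDs, and sub-request payloads are all invented for illustration:

```http
POST /batch HTTP/1.1
Host: example.com
Content-Type: multipart/http; boundary=batch_1

--batch_1
Content-ID: 1

POST /greetings HTTP/1.1
Content-Type: application/json

{"message": "Hello"}

--batch_1
Content-ID: 2

GET /greetings HTTP/1.1

--batch_1--
```

The response would mirror this shape, one part per sub-response, matched back to its sub-request by Content-ID.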
  19. @gregoirehebert @gheb_dev ➔ Multipart 1 2 Research and implementations A

    typical Content-Type multipart header field declares a boundary string. It indicates that the entity consists of multiple parts, each with a structure syntactically identical to an RFC 822 message, except that the start line and header section may be completely empty, and each part is preceded by a boundary line. This boundary can be almost any string.
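For illustration, such a header field could look like this (the boundary value is arbitrary, chosen so that it does not occur inside the parts themselves):

```http
Content-Type: multipart/mixed; boundary=unique-delimiter-42
```

Each part then begins on a line reading `--unique-delimiter-42`, and the last part is followed by the closing line `--unique-delimiter-42--`.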
  20. @gregoirehebert @gheb_dev ➔ Multipart ➔ Multipart/mixed ➔ Multipart/alternative ➔ Multipart/digest

    1 2 Research and implementations ➔ variation of mixed containing multiple messages (usually for emails when there are messages from different people)
  21. @gregoirehebert @gheb_dev ➔ Multipart ➔ Multipart/mixed ➔ Multipart/alternative ➔ Multipart/digest

    ➔ Multipart/parallel 1 2 Research and implementations ➔ the parts are intended to be presented in parallel for hardware and software that are capable of doing so.
  22. @gregoirehebert @gheb_dev ➔ Multipart ➔ Multipart/mixed ➔ Multipart/alternative ➔ Multipart/digest

    ➔ Multipart/parallel ➔ A sub-message can be Multipart itself. 1 2 Research and implementations Now that we know more, let's see the others. I looked at what Google was doing, and it's quite similar to what the IETF draft proposes. There is also OData from Microsoft, which is quite similar, with some nice additions and some less painless ones.
  23. @gregoirehebert @gheb_dev ➔ Odata 1 2 Research and implementations ➔

    Dependencies between queries ➔ References in the URL of subsequent queries.
  24. @gregoirehebert @gheb_dev ➔ Odata 1 2 Research and implementations ➔

    Dependencies between queries ➔ References in the URL of subsequent queries. ➔ Values of a response body in the query part of the url or in the body of subsequent queries.
  25. @gregoirehebert @gheb_dev ➔ Odata 1 2 Research and implementations ➔

    Dependencies between queries ➔ References in the URL of subsequent queries. ➔ Values of a response body in the query part of the url or in the body of subsequent queries. ➔ Individual processing in the order of reception.
  26. @gregoirehebert @gheb_dev ➔ Odata 1 2 Research and implementations ➔

    Dependencies between queries ➔ References in the URL of subsequent queries. ➔ Values of a response body in the query part of the url or in the body of subsequent queries. ➔ Individual processing in the order of reception. ➔ Stop at the first error, unless the continue-on-error preference is specified.
  27. @gregoirehebert @gheb_dev ➔ Odata 1 2 Research and implementations ➔

    Dependencies between queries ➔ References in the URL of subsequent queries. ➔ Values of a response body in the query part of the url or in the body of subsequent queries. ➔ Individual processing in the order of reception. ➔ Stop at the first error, unless the continue-on-error preference is specified. ➔ Apply all, or nothing.
  28. @gregoirehebert @gheb_dev ➔ Odata ➔ Dependencies between queries ➔ References

    in the URL of subsequent queries. ➔ Values of a response body in the query part of the URL or in the body of subsequent queries. ➔ Individual processing in the order of reception. ➔ Stop at the first error, unless the continue-on-error preference is specified. ➔ Apply all, or nothing. ➔ OData has been standardized by OASIS and approved as an international ISO/IEC standard. 1 2 Research and implementations OData offers a more complete, but also more complex, solution while sharing the same foundations as the previous ones. It is therefore a good base. So for API Platform, I chose to implement batch requests following the OData specification.
  29. @gregoirehebert @gheb_dev ➔ Useful knowledge about PHP 1 2 Research

    and implementations We need to be able to parse the HTTP request, and extract every part of it. Everything that has been sent must be extracted.
  30. @gregoirehebert @gheb_dev ➔ Useful knowledge about PHP 1 2 Research

    and implementations Unfortunately, there is no API offered by the language to get the original HTTP request in our hands. Usually, everything has already been processed to be accessible from the global variables. Often, PHP is executed by CGI, or in a more modern way with FastCGI, itself served by PHP-FPM (FastCGI Process Manager).
  31. @gregoirehebert @gheb_dev 1 2 Research and implementations ➔ Useful knowledge

    about PHP The HTTP request is split into specific parts, inbound and outbound.
  32. @gregoirehebert @gheb_dev 1 2 Research and implementations ➔ Useful knowledge

    about PHP And FastCGI chops the HTTP message up before sending it to PHP. Here, we have to use all that PHP offers, but also parse the body of the request, then split it into new sub-requests. Because a multipart/mixed HTTP request can have recursive subparts, and because one of them can theoretically contain files and thus increase memory use substantially, we can't just take the whole body and parse it without risking memory problems. We have to deal with the submitted data through a stream.
  33. @gregoirehebert @gheb_dev 1 2 Research and implementations ➔ Useful knowledge

    about PHP All that we need comes from the standard input value.
  34. @gregoirehebert @gheb_dev ➔ Wrappers & Stream ➔ file:// ➔ http://

    ➔ ftp:// ➔ php:// ➔ zlib:// ➔ data:// ➔ glob:// ➔ phar:// ➔ ssh2:// ➔ rar:// ➔ ogg:// ➔ expect:// 1 2 Research and implementations To grab the payload in PHP, we need to go through the available wrappers
  35. @gregoirehebert @gheb_dev ➔ Wrappers & Stream ➔ file:// ➔ http://

    ➔ ftp:// ➔ php:// ➔ zlib:// ➔ data:// ➔ glob:// ➔ phar:// ➔ ssh2:// ➔ rar:// ➔ ogg:// ➔ expect:// ➔ php://stdin ➔ php://stdout ➔ php://stderr ➔ php://input ➔ php://output ➔ php://fd ➔ php://memory ➔ php://temp ➔ php://filter 1 2 Research and implementations php:// in particular
  36. @gregoirehebert @gheb_dev ➔ Wrappers & Stream ➔ file:// ➔ http://

    ➔ ftp:// ➔ php:// ➔ zlib:// ➔ data:// ➔ glob:// ➔ phar:// ➔ ssh2:// ➔ rar:// ➔ ogg:// ➔ expect:// ➔ php://stdin ➔ php://stdout ➔ php://stderr ➔ php://input ➔ php://output ➔ php://fd ➔ php://memory ➔ php://temp ➔ php://filter 1 2 Research and implementations php://stdin, or even php://input, is a read-only stream that allows you to read the raw data from the request body. Despite this, PHP is not really designed to handle multipart content outside of file uploads, and at that moment I didn't have the slightest idea of any HTTP parser written in PHP. It was time to scour the internet.
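A minimal sketch of the streaming idea in PHP: read the body chunk by chunk instead of buffering it whole. In an FPM context the handle would be `fopen('php://input', 'rb')`; here `php://temp` stands in so the sketch is self-contained, and the chunk size is arbitrary.

```php
<?php

// Read a body stream in fixed-size chunks, so a large multipart payload
// never has to fit into memory at once.
function streamChunks($handle, int $chunkSize = 8192): \Generator
{
    while (!feof($handle)) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break;
        }
        yield $chunk;
    }
}

// Simulate a request body with php://temp (stand-in for php://input,
// which is read-only and unavailable on the CLI).
$body = fopen('php://temp', 'r+b');
fwrite($body, str_repeat('part-of-a-large-multipart-payload;', 1000));
rewind($body);

$total = 0;
foreach (streamChunks($body, 4096) as $chunk) {
    $total += strlen($chunk); // a real parser would scan each chunk for boundary lines here
}
fclose($body);
```

A real multipart parser would carry state across chunks, since a boundary line can straddle two reads.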
  37. @gregoirehebert @gheb_dev ➔ Libraries 1 2 Research and implementations I

    found an 8-year-old library, too simplistic and incomplete, and this one, with a few stars... a bit scary. BUT! There are 2 well-known OSS contributors in it: Tobias Nyholm (Symfony core team) and Nuno Maduro (engineer at @laravel, working on Laravel, Forge, and Vapor; created @pestphp and PHP Insights). It might be worth taking a look. The package is used in Bref, Laravel, Firebase, or the PHP League. I could have used it, but I also know that there is a will to reduce the number of dependencies in API Platform. Especially since a whole part could belong in the HttpFoundation and MIME components. So I started from scratch.
  38. @gregoirehebert @gheb_dev 1 2 3 Usages In API Platform, this

    is achieved by changing two things in your configuration. By doing this, you have a new endpoint accessible and visible in your OpenApi documentation.
  39. @gregoirehebert @gheb_dev 1 2 3 Usages While Swagger UI allows

    you to sandbox your API and send requests from the interface, it's not really intended to be used for multipart requests, because you don't have enough control over the headers. You can use it, but I don't recommend it.
  40. @gregoirehebert @gheb_dev 1 2 3 Usages - Request Let's send

    our first HTTP batch request with a multipart mixed content type.
  41. @gregoirehebert @gheb_dev 1 2 3 Usages - Request this means

    that we will include HTTP sub-requests afterwards,
  42. @gregoirehebert @gheb_dev 1 2 3 Usages - Request Maintaining an

    HTTP Parser is not easy to do, but do you know what is?
  43. @gregoirehebert @gheb_dev 1 2 3 Usages - JSON json !

    It is very convenient because it's easy to produce, and API Platform is already capable of dealing with it. But, on the other hand, there is a big disadvantage compared to HTTP multipart: you cannot include a batch in a batch. Not with JSON. At least not according to the OData specification.
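A sketch of what an OData-style JSON batch body looks like (the exact envelope accepted here may differ; ids, paths, and payloads are invented for illustration):

```json
{
  "requests": [
    {
      "id": "1",
      "method": "POST",
      "url": "/greetings",
      "headers": { "content-type": "application/json" },
      "body": { "message": "Hello" }
    },
    { "id": "2", "method": "GET", "url": "/greetings" }
  ]
}
```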
  44. @gregoirehebert @gheb_dev 1 2 3 Usages - Embedded requests We

    will retrieve a collection, which should be empty. Then we include a second batch in which we post a new greeting resource, and we retrieve the collection again, which should contain our newly created resource.
  46. @gregoirehebert @gheb_dev You will have noticed the following: ➔ Nested

    batch request support (HTTP format only) ➔ Each sub-operation has its own request in the Symfony stack. ➔ Each request has its own profiler. ➔ We use an API Platform Post operation, so we can use any available operation option, to some extent. 1 2 3 Usages
  47. @gregoirehebert @gheb_dev 1 2 3 Advanced Usages - continue-on-error By

    default, if one of the sub-requests fails, it cancels the execution of the following requests, and you will also have to cancel the previous valid operations. But by specifying a Prefer header with the continue-on-error attribute, all sub-requests will be processed regardless of previous successes or failures.
  48. @gregoirehebert @gheb_dev 1 2 3 Advanced Usages - reference Each

    individual request in a batch request MUST have an id in a Content-ID header. The request id is case sensitive, MUST be unique within the batch request, and must be a positive integer. Entities created by a POST request can be referenced in the URL of subsequent requests by using the $-prefixed request id as the first segment of the request path.
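Sketch of such a reference, following the OData convention: the second sub-request targets the entity created by the first via `$1` (boundary, paths, and payloads are invented for illustration):

```http
--batch_1
Content-ID: 1

POST /companies HTTP/1.1
Content-Type: application/json

{"name": "ACME"}

--batch_1
Content-ID: 2

POST /$1/employees HTTP/1.1
Content-Type: application/json

{"name": "Grégoire"}

--batch_1--
```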
  49. @gregoirehebert @gheb_dev 1 2 3 Advanced Usages - reference In

    the JSON format, it is possible to define a dependency condition on previous requests. If one or more of the previous individual requests are invalid, the current and subsequent individual requests are not processed, and a status code 424 (Failed Dependency) is returned, unless you have set the continue-on-error preference. Here, the second request has been defined as dependent on request "1", but we have marked the first one as "3". The request will fail. And it is not possible to reference future requests.
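The failing scenario described can be sketched in JSON like this: the second request declares a dependency on id "1", but the only prior request carries id "3", so processing stops with a 424 (ids, paths, and payloads are invented for illustration):

```json
{
  "requests": [
    { "id": "3", "method": "POST", "url": "/greetings", "body": { "message": "Hi" } },
    { "id": "2", "dependsOn": ["1"], "method": "GET", "url": "/greetings" }
  ]
}
```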
  50. 1 2 3 4 Alternatives @gregoirehebert @gheb_dev ➔ PUT, POST

    ➔ Embedded resources + Serialization groups ➔ Custom Processor / Custom resources
  51. 1 2 3 4 Alternatives @gregoirehebert @gheb_dev ➔ RPC ➔

    RPC EndPoint ➔ GraphQL EndPoint + mutations
  52. 1 2 3 4 5 What's next @gregoirehebert @gheb_dev ❏

    Atomicity check in a request group ❏ Protection against forbidden header passing related to authorization and authentication ❏ Header Parser lacks comma separation management ❏ Use of an optional application/http MIME ❏ Adding configuration keys to the Batch operation ❏ Functional tests (50%) ❏ Unit tests (0%) ❏ Documentation (70%) ➔ Dev side that needs to be done I still have this amount of work to do before being satisfied, and comfortable sharing it publicly. I do this during my free time, so if this is a feature you're interested in, please contact me, we can organise a partnership to accelerate things :)
  53. @gregoirehebert @gheb_dev ❏ Complete the draft of J. Snell and

    propose a modern alternative to OData in HTTP without the complexities and specificities of OData. ❏ Implement OData in API Platform? ➔ OData vs HTTP 1 2 3 4 5 What's next While OData is a standard, some of its rules have been designed for specific reasons that are not tied to HTTP requirements, and it remains vendor-driven. In addition, most teams design classic RESTful APIs with OpenAPI documentation, specified with RDF formats and the Web Ontology Language. Furthermore, OData expects the API to be exposed with its own documentation format. It is difficult to tell a team that they need to rewrite their documentation automation, when nothing is wrong with their technology choices, just because they're lacking definitions. Sometimes in these situations, pursuing the previous work is also the right move, while continuing to push toward the novelties of HTTP/2 and 3. My next goal is to specify HTTP batch requests, continuing the work of J. Snell.