Two years ago, I started to be interested in API-first designs. I was asked during my annual review, “what kind of project would you like to be involved with, this next year?” My response: I want to develop projects from an API-first perspective.
Little did I know that I was about to embark on a series of projects that would teach me not only about decoupled Drupal, but also the subtleties of designing proper APIs to maximize performance and minimize roundtrips to the server.
Embedding resources
I was lucky enough that the client that I was working with at the time—The Tonight Show with Jimmy Fallon—decided on a decoupled approach. I was involved in the Drupal HTTP API server implementation. The project went on to win an Emmy Award for Outstanding Interactive Program.
The idea of a content repository that could be accessed from anywhere via HTTP, and leverage all the cool technologies 2014 had to offer, was—if not revolutionary—forward-looking. I was amazed by the possibilities the approach opened. The ability to expose Drupal’s data to an external team that could work in parallel using the front-end technologies that they were proficient with meant work could begin immediately. Nevertheless, there were drawbacks to the approach. For instance, we observed a lot of round trips between the consumer of the data—the client—and the server.
As it turns out, The Tonight Show with Jimmy Fallon was only the first of several decoupled projects that I undertook in rapid succession. As a result, I authored version 2.x of the RESTful module to support the JSON API spec in Drupal 7. One of the strong points of this specification is resource embedding. Embedding resources—also called resource composition—is a technique where the response to a particular entity also contains the contents of the entities it is related to. Embedding resources for relationships is one of the most effective ways to reduce the number of round trips when dealing with REST servers. It’s based on the idea that the consumer requests the relationships that it wants embedded in the response. This same idea is used in many other specifications like GraphQL.
In JSON API, the consumer can interrogate the data with a single query, tracing relationships between objects and returning the desired data in one trip. Imagine searching for a great grandparent with a genealogy system that would only let you find the name of a single family member at a time versus a system that could return every ancestor going back three generations with a single request. To do so, the client appends an include
parameter in the URL. For example:
?include=relationship1,relationship2.nestedRelationship1
The response will include information about four entities:
- The entity being requested (the one that contains the relationships).
- The entity that
relationship1
points to. This may be an entity reference field inside of the entity being requested. - The entity that
relationship2
points to. - The entity that
nestedRelationship1
points to. This may be an entity reference field inside of the entity thatrelationship2
is pointing to.
A single request from a consumer can return multiple entities. Note that for the same API different consumers may follow different embedding patterns, depending on the designs being implemented.
The landscape for JSON API and Drupal nowadays seems bright. Dries Buytaert, the product lead of the Drupal project, hopes to include the JSON API module in core. Moreover, there seems to be numerous articles about decoupling techniques for Drupal 8.
But does resource embedding offer a performance edge over multiple round-trip requests? Let’s quantitatively compare the two.
Performance comparison
This performance comparison uses a clean Drupal installation with some automatically generated content. Bear in mind, performance analysis is tightly coupled to the content model and merits case-by-case study. Nevertheless, let’s analyze the response times to test our hypothesis: that resource embedding provides a performance improvement over traditional REST approaches.
Our test case will involve the creation of an article detail page that comes with the Standard Drupal profile. I also included the profile image of a commenter to make things a bit more complex.
In Figure 1, I’ve visually indicated the “levels” of relationships between the article itself and each accompanying chunk of content necessary to compose the “page.” Using traditional REST, a particular consumer would need to make the following requests:
- Request the given article (node/2410).
- Once the article response comes back it will need to request, in parallel:
- The author of the article.
- The profile image of the author of the article.
- The image for the article.
- The first tag for the article.
- The second tag for the article.
- The first comment on the article.
- The author of the first comment on the article.
- The profile image of the author of the first comment of the article.
- The author of the first comment on the article.
- The second comment of the article.
- The author of the second comment of the article.
- The author of the article.
In contrast, using the JSON API module (or any other with resource composition), will only require a single request with the include query parameter set to
?include=uid,uid.field_image,field_tags,comments,comments.uid,comments.uid.field_image
When the server gets such a request it will load all the requested entities and return them back in a single swoop. Thus, the front-end framework for your decoupled app gets all of its data requirements in a JSON document in a single request instead of many.
For simplicity I will assume that the overall response time of the REST-based approach will be the one with the longest path (four levels deep). Having four parallel requests that happen at the same time will not have a big impact on the final response time. In a more realistic performance analysis, we would take into account that having four parallel calls degrades the overall performance. Even in this handicapped scenario the resource embedding should have a better response time.
Once the request reaches the server, if the response to it is ready in the different caching layers, it takes the same effort to retrieve a big JSON document for the JSON API request than to retrieve a small JSON document for one of the REST requests. That indicates that the big effort is in bootstrapping Drupal to a point where it can serve a cached response. That is true for anonymous and authenticated traffic, via the Page Cache and Dynamic Page Cache core modules.
The graphic above shows the response time for each approach. Both approaches are cached in the page cache, so there is a constant response time to bootstrap Drupal and grab the cache. For this example the response time for every request was ~7 ms every time.
It is obvious that the more complex the interconnections between your data are the greater the advantage of using JSON API's resource embedding. I believe that even though this example is extremely simple, we were able to cut response time by 75%.
If we now introduce latency between the consumer and the server, you can observe that the JSON API response still takes 75% less time. However, the total response time is degraded significantly. In the following chart, I have assumed an optimistic, and constant, transport time of 75 ms.
Conclusion
This article describes some sophisticated techniques that can dramatically improve the performance of our apps. There are some other challenges in decoupled systems that I did not mention here. If you are interested in those—and more!—please attend my session at DrupalCon Dublin Building Advanced Web Services With JSON API, or watch it later. I hope to prove that you can create more engaging products and services by using advanced web services. After all, when you are building digital experiences one of your primary goals you should be to make it enjoyable to the user. Shorter response times correlate to more successful user engagement. And successful engagements make for amazing digital experiences .