The Secret Sauce Behind Building an Elastic API
If you're using or currently plan on using a CDN provider with a web service API, here's a look at how to achieve flexibility
The Xumo platform is built on a number of discrete application programming interface (API) and representational state transfer (REST) services designed with the flexibility to support both proprietary internal clients here at Xumo and the external clients of our partners. We've taken great care to design our API so that it is not prescriptive to any single client, but instead offers a capable and flexible environment that can cater to the differing implementations of our clients.
The issues I address below are applicable to any online platform that is currently using or considering using a CDN provider with a web service API; I got the idea to write this article based on interactions with colleagues from other companies who were dealing with similar issues.
One aspect that ranks highly when developing an external API is how to define the contract between the API and the client, and how to manage changes to that API. Changes to that contract don't happen without a certain amount of pain, and so the topic of API versioning arrives quickly. No API is perfect on the first pass, and ours is no exception.
Once we have a client using our service we have a commitment to provide a service with expected behavior. Even when our internal clients need a revised response format we must maintain backwards compatibility against the existing API. The problem can be split into two separate areas. The first is the internal mechanism to support multiple versions in the codebase, and the second is the external API contract between the client and server. There is a clean layer of separation between the two concerns. The internal mechanism should not be visible (at all) to the client, and so we should be free to implement whatever approach is deemed best for the language and framework at hand. It should not bleed into the API at all.
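As a rough illustration of keeping the internal mechanism separate from the external contract, the sketch below routes a single internal model through per-version renderers. The `CHANNEL` model, its field names, and the `render_v1`/`render_v2` functions are all hypothetical and are not taken from the Xumo codebase.

```python
from typing import Callable, Dict

# Hypothetical internal model; the field names are illustrative only.
CHANNEL = {"id": 42, "title": "News", "genre": "current-affairs"}


def render_v1(channel: dict) -> dict:
    # Assumed version 1 contract: flat structure that exposes "name".
    return {"id": channel["id"], "name": channel["title"]}


def render_v2(channel: dict) -> dict:
    # Assumed version 2 contract: "title" plus nested metadata.
    return {
        "id": channel["id"],
        "title": channel["title"],
        "meta": {"genre": channel["genre"]},
    }


# The registry is purely internal; clients only ever see a version number.
RENDERERS: Dict[int, Callable[[dict], dict]] = {1: render_v1, 2: render_v2}


def render(channel: dict, version: int) -> dict:
    """Render the internal model in the representation for `version`."""
    try:
        renderer = RENDERERS[version]
    except KeyError:
        raise ValueError(f"unsupported API version: {version}") from None
    return renderer(channel)
```

Adding a version 3 in this arrangement means registering one more renderer; the existing version 1 and version 2 output is untouched, which is the backwards-compatibility property the external contract requires.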
When it came to the external design of the Xumo API, we went through a few different iterations. The first API was born out of an adoption of HATEOAS, or Hypermedia as the Engine of Application State. Hypermedia is complex enough to justify a post all by itself, so I'll cut to the chase by highlighting that it relies on the client entering the API through a single fixed endpoint and 'discovering' available API functionality from the responses received, which can vary on a client-by-client basis. A typical Hypermedia response might return the available resources, along with the media representations that relate to those resources. It is in these media representations that we cater to multiple versions.
In such a response, the API indicates that a resource is available in a version 1 or a version 2 representation, and in XML or JSON if required. The client then asks for the representation it wants, for example a version 1 XML response, through the Accept header of its request.
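As a rough illustration of how a server might interpret such a request, the sketch below parses an Accept header that carries both the version and the format. The vendor media type `application/vnd.example.v1+xml` is an assumption made for this example and is not the media type used by the Xumo API; the parser also ignores q-value ordering for brevity.

```python
import re
from typing import Optional, Tuple

# Hypothetical vendor media type: application/vnd.example.v<N>+<xml|json>
MEDIA_TYPE = re.compile(r"^application/vnd\.example\.v(\d+)\+(xml|json)$")


def negotiate(accept_header: str) -> Optional[Tuple[int, str]]:
    """Return (version, format) for the first recognised media type, else None."""
    for item in accept_header.split(","):
        # Drop media-type parameters such as ";q=0.9" before matching.
        media_type = item.split(";")[0].strip().lower()
        match = MEDIA_TYPE.match(media_type)
        if match:
            return int(match.group(1)), match.group(2)
    return None
```

A request sent with `Accept: application/vnd.example.v1+xml` would resolve to version 1 and XML. Because the representation now depends on a request header, the server also has to answer with 'Vary: Accept' so that caches know the URL alone does not identify the response.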
Content delivery network (CDN) implementations can also differ in how they handle the Vary header. By default, it's typical for a CDN to honor only a 'Vary: Accept-Encoding' response. For all other Vary values, default Edgecast behavior is to cache the first response sent back from the origin, whereas Akamai will not cache anything (other than 'Vary: Accept-Encoding'). In the Edgecast case you may experience unpredictable behavior. For example, a client may request a version 1 resource but receive a version 2 resource if that was the first response that the CDN had the opportunity to cache. Even though the server specified 'Vary: Accept', the CDN would ignore it entirely and behave as if the Vary header wasn't specified at all. Depending on your use case, this may prove to be a deal-breaker.
In the Akamai case it's arguably not so bad, but the behavior can still be undesirable. Akamai will not cache the incorrect response, but will instead go back to the origin every single time. The client can rely on receiving the correct response each time, but the origin can become overloaded, as the CDN is not caching anything.
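The two default behaviors described above can be modeled with a toy edge cache. This is a deliberately simplified sketch, not a model of either vendor's real implementation: the `ToyCdn` class, the `origin` function, and the media types in the usage note are hypothetical, and the cache is keyed on the URL path only.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[Dict[str, str], str]  # (response headers, body)


def origin(path: str, accept: str) -> Response:
    # Hypothetical origin: chooses the version from the Accept header.
    version = "v2" if "v2" in accept else "v1"
    return {"Vary": "Accept"}, f"{version} representation of {path}"


class ToyCdn:
    """Toy edge cache keyed on the URL path only.

    cache_unsupported_vary=True  -> caches the first response and ignores Vary
    cache_unsupported_vary=False -> never caches a response whose Vary value
                                    is anything other than Accept-Encoding
    """

    def __init__(self, origin_fn: Callable[[str, str], Response],
                 cache_unsupported_vary: bool) -> None:
        self.origin_fn = origin_fn
        self.cache_unsupported_vary = cache_unsupported_vary
        self.cache: Dict[str, str] = {}
        self.origin_hits = 0

    def get(self, path: str, accept: str) -> str:
        if path in self.cache:
            return self.cache[path]  # served from the edge, Accept is ignored
        self.origin_hits += 1
        headers, body = self.origin_fn(path, accept)
        vary = headers.get("Vary", "")
        if vary in ("", "Accept-Encoding") or self.cache_unsupported_vary:
            self.cache[path] = body
        return body
```

With `cache_unsupported_vary=True`, a version 2 request followed by a version 1 request for the same path returns the version 2 body both times after a single origin hit. With `cache_unsupported_vary=False`, both clients receive the correct body, but every request reaches the origin.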
While it is possible to create custom configuration for these CDNs to get the behavior needed, we made the decision to move away from content negotiation for our versioning strategy. Any custom rules set up with one CDN bind us to that provider and increase the cost of switching to another, which is not to be ignored.
The next versioning approach we considered was custom HTTP headers.
Custom HTTP Headers
With this approach, the client states the version it wants in a custom request header such as X-Api-Version.
Unfortunately, this has many of the same downsides as the content-negotiation strategy. The server would need to specify 'Vary: X-Api-Version' (which still gets ignored by the downstream caches), and the versioning contract has to be explained to the clients 'out of band'. As with the HATEOAS content-negotiation approach, an integrator cannot simply put the request in a browser for easy testing. As a result, this approach was quickly dropped.
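For comparison, a server-side handler for the custom-header approach might look like the minimal sketch below. Only the X-Api-Version header name comes from the discussion above; the default version, the set of supported versions, and the function names are assumptions made for illustration.

```python
from typing import Dict, Mapping

DEFAULT_VERSION = 1          # assumption: requests without the header get version 1
SUPPORTED_VERSIONS = {1, 2}  # assumption: two live versions


def version_from_headers(headers: Mapping[str, str]) -> int:
    """Read the requested API version from the X-Api-Version request header."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    raw = normalised.get("x-api-version")
    if raw is None:
        return DEFAULT_VERSION
    try:
        version = int(raw.strip())
    except ValueError:
        raise ValueError(f"malformed X-Api-Version: {raw!r}") from None
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported X-Api-Version: {version}")
    return version


def response_headers() -> Dict[str, str]:
    # The response varies by the custom header, so caches must be told.
    return {"Vary": "X-Api-Version"}
```

The sketch makes the drawback visible: nothing in the URL tells a cache, or a person testing in a browser, which version is being requested.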
Path Parameter Versioning
The final approach was to incorporate the versioning information in the URL (either query or path parameter). We adopted a path parameter approach.
This means adding an explicit version segment to the URL path to specify the requested version.
With this approach it is clear which version has been requested, and because the version is a mandatory part of the path, the client must include it (an advantage over a query-parameter-based approach). Additionally, the API can be used directly in a browser, if required.
The client specifies the response format by ending the path with either '.xml' or '.json', which CDNs can handle with no explicit or custom configuration. Such URLs are completely portable between CDNs.
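A minimal sketch of how a server might parse such a URL follows. The path layout `/v<N>/<resource>.<xml|json>` and the example path in the usage note are assumptions chosen for illustration; they are not the actual Xumo routes.

```python
import re
from typing import NamedTuple


class ApiRequest(NamedTuple):
    version: int
    resource: str
    fmt: str


# Hypothetical layout: /v<version>/<resource path>.<xml|json>
PATH = re.compile(r"^/v(\d+)/(.+)\.(xml|json)$")


def parse_path(path: str) -> ApiRequest:
    """Split a versioned path into version, resource, and response format."""
    match = PATH.match(path)
    if match is None:
        # The version and the format suffix are both mandatory.
        raise ValueError(f"expected /v<N>/<resource>.<xml|json>, got {path!r}")
    return ApiRequest(int(match.group(1)), match.group(2), match.group(3))
```

For example, `parse_path("/v1/channels/list.json")` yields version 1, resource `channels/list`, and format `json`. Since the version and the format are both part of the URL, each representation has its own cache key and the response needs no Vary header beyond Accept-Encoding.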
At the time of writing, the following companies are using a split of versioning approaches:
Amazon EC2 (API version is YYYY-MM-DD format)—http://docs.aws.amazon.com/AWSEC2/latest/APIReference/CommonParameters.html
Azure (also supports custom header)—https://msdn.microsoft.com/library/azure/dd894041.aspx
About the Author
Sam Hazim is lead server engineer at Xumo, based out of Irvine, California. Working on services that get used by all classes of connected devices, Sam is passionate about building powerful, scalable systems that unlock the potential of their hardware. His core interests involve web services, performance optimization, and service security.