The Algorithm Series: The Math Behind the CDN
All of the above leads us back to consistent hashing and how it makes use of modulo mathematics as part of a larger algorithm intended both to avoid "hot spots" (heavy loads on origin servers) and to more effectively distribute content requests among the "plurality of resources" on which U.S. Patent No. 8458259 is based.
Hashing has been used in various ways for streaming, not just for CDN storage solutions. The open source hashing tool xxHash, for instance, is being used for file transfer by Netflix, has been integrated into the Microsoft Azure platform, and is also in use in databases, such as MySQL, where manifest files and user account information are typically stored.
The simpler the content and the key structure, the higher the likelihood that collisions will occur, so a number of hashing solutions have moved to long alphanumeric strings. Some also include encryption, in which the initial values are scrambled with a known cryptographic key.
Another issue with hashing is that it is not random, which means it is not foolproof in avoiding collisions. But even true randomization of content placement on servers has the potential for collisions, since random number generation could result in two or more pieces of content being assigned the same whole number. Having said that, there are ways to use hashing as a decent substitute for truly random content placement on one or multiple servers while still avoiding collisions.
While the caching server stores a copy of content replicated from the origin server, it also only has a limited amount of storage space. As such, long-tail or less-popular content (as determined by a Bloom filter algorithm) may be eliminated from a caching server closest to the end user. In this case, a request to the caching server results in a cache miss, and the caching server has to contact the origin server to meet the end user's request.
We'll come back to the cache hit-or-miss ratio concept in a moment, but let's stop here and note one other point made in the patent:
There is a tradeoff of a greater performance benefit if more clients share the same cache server that is balanced against an increased likelihood that the cache server itself will become overloaded. One approach to solving this dilemma is to use a group of servers that function together as a cache.
The patent application goes on to note, though, that just throwing more servers at the problem in a linear fashion won't solve it. We'll get to the suggested approach in a moment, but for now, let's consider how content is searched across a cluster of caching servers at a given location.
Rather than requiring an end-user request to traverse all the servers in a cluster, a combination of hashing and modulo can help limit the number of servers searched, which lessens the overall search complexity and time.
Earlier in the article, I mentioned that modulo remainders help determine which server a given piece of data might be closest to. There's a good explanation on the Carnegie Mellon University website, but let's use our Social Security number example to illustrate the point of modulo, hashing, and mapping of caching server storage.
We will assume for this example that there are 19 servers. We're using 19 since it is a prime number; a prime modulus shares no common factors with patterned keys, so the remainders (0 through 18) spread content more evenly across the servers rather than clustering it on a few of them.
Let's use John Fitzgerald Kennedy, whose presidential library website lists his Social Security number as 026-22-3747. If we strip the dashes, we come up with a hash of 026223747 (or 26223747, once the leading zero drops away). To determine which of the 19 servers content for hash 26223747 resides on, we would use the following formula:
26223747 mod 19 = 4
So content about John Fitzgerald Kennedy would be found on caching server 4 in this particular server cluster. If the content were not there, the request made to the origin server would not only return the content to the end user but also return the content to caching server 4 for subsequent user requests. One reason the content may not be available on a given server, even if it were just a few days ago, is the use of business rules such as Bloom filtering, which actively removes less-popular content from the caching server to make room in the cluster's storage for more popular content.
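The placement rule walked through above can be sketched in a few lines of Python. The key and the server count come from the example in the text; the function name is our own:

```python
# A minimal sketch of modulo-based server selection, using the
# Social Security number example from the article.

def server_for_key(numeric_key: int, num_servers: int) -> int:
    """Map a numeric content key to a server index via modulo."""
    return numeric_key % num_servers

# JFK's SSN with dashes stripped (the leading zero drops away
# when the string is treated as an integer).
key = 26223747
print(server_for_key(key, 19))  # -> 4
```

Any client that knows the key and the server count can repeat this calculation and arrive at the same server, which is what makes the scheme attractive in the first place.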
Prime numbers like this show up in programming languages, too. Java, for instance, uses 31 as the multiplier in its string hashCode() calculation. Besides being an odd prime, 31 is a power of 2 minus 1 (32 is 2 to the fifth power). Since computing works best in powers of 2 (including 32-bit and 64-bit), multiplying by 31 can be done with a cheap bit shift and a subtraction, making it computationally less taxing.
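A short sketch of a 31-multiplier string hash of the kind Java's hashCode() uses, along with the shift-and-subtract identity that makes 31 cheap to multiply by (the unbounded integer arithmetic here is a simplification; Java also wraps to 32 bits):

```python
# A 31-multiplier string hash in the style of Java's hashCode().

def string_hash(s: str) -> int:
    h = 0
    for ch in s:
        h = 31 * h + ord(ch)  # Java additionally wraps to 32 bits
    return h

# Because 31 == 2**5 - 1, a compiler can rewrite the multiplication:
# 31 * i == (i << 5) - i.
for i in (1, 7, 26223747):
    assert 31 * i == (i << 5) - i

print(string_hash("cdn"))
```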
OK, so we know which caching server the content is supposed to reside on. But what happens if the number of servers in a caching cluster changes, either by a server failing or a server being added? If we dropped from 19 servers to 18 servers, the formula for our hash example would be the following:
26223747 mod 18 = 15
So any request to server 4 for the content would now miss; the content maps to server 15 instead. Worse, though, in a traditional server cluster, all hash keys mapped to all caching servers would need to be recalculated. That's a lot of work for just one server failing.
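A quick simulation makes the scale of the problem concrete: map a batch of keys with mod 19, shrink the cluster to 18 servers, and count how many keys land somewhere new. The key values here are arbitrary test data:

```python
# Sketch: how many keys change servers when a modulo-mapped
# cluster shrinks from 19 to 18 servers?

import random

random.seed(42)
keys = [random.randrange(10**9) for _ in range(10_000)]

moved = sum(1 for k in keys if k % 19 != k % 18)
print(f"{moved / len(keys):.0%} of keys remapped")
```

Running this shows that the vast majority of keys move, which is exactly the recalculation burden the patent set out to avoid.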
Finally, it's also worth noting that hashing tables aren't required, but they speed up searches on a single server and act as a road map to which server in a multi-server cluster contains the desired piece of information.
Circling the Caching Cluster
The idea of consistent hashing is to use a physical cluster of servers (as might be found in a point of presence) in a logical "circular" or "round robin" pattern (which is why consistent hashing is also sometimes referred to as "ring hashing").
Think of all the equations you had to do with a circle in high school algebra. The circle was divided into 360°. Now instead of degrees, divide that circle into thousands or hundreds of thousands of slivers or shards, just like you would divide a single pumpkin pie into 12 slices if 12 Thanksgiving dinner guests all wanted a slice. Every one of the 12 would get a smaller piece of pie than if you'd only had to divide it among eight guests, but every guest still gets at least a piece of the pie.
The same concept is true for servers (n) in a ring: Add an extra server, and each server's share of the load potentially drops from 1/n to 1/(n+1). So if you've got 13 servers, each one bearing 1/13 (or 7.692%) of the total server cluster's load, and you add a 14th, then each server only needs to bear 1/14, or 7.143%, of the total server cluster's load.
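The pie arithmetic is simple enough to check directly:

```python
# The per-server share from the pie analogy: adding one server to
# a 13-server ring drops each server's ideal share of the load.
for n in (13, 14):
    print(f"{n} servers -> {100 / n:.3f}% each")
```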
But how to evenly parse the load across the available servers? This is where consistent hashing comes into play, as delineated in the 1997 paper as well as the MIT patents mentioned at the beginning of the article.
Consistent hashing is a powerful tool not just for initial caching but also for scenarios in which the quantity of servers in a server cluster frequently changes. It's also been used in solutions such as Amazon's DynamoDB and Big Data database cluster models like Cassandra and Riak.
The most rudimentary form of ring hashing places servers equidistant from each other on an imaginary circle. If there are three servers, for instance, they could be manually spaced 120° apart (think of a clock where servers would be placed at 12 o'clock, 4 o'clock, and 8 o'clock). As a way to further automate the addition or deletion of servers, some form of unique number can also be used and then tied to a particular point on the ring.
A diagram illustrating ring hashing of documents and servers, adapted from U.S. Patent No. 8458259, Method and Apparatus for Distributing Requests Among a Plurality of Resources (go2sm.com/cdnpatent)
For instance, if the static IP address of a server is 192.168.1.101, we could use a hash of 1921681101 to place that server at a particular point on the ring. If the server needs to be replaced, however, the new server would have to use the same static IP address.
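One common way to automate this placement is to run the server's identifier through a standard hash function and map the result onto the circle. The sketch below hashes an IP address to an angle between 0° and 360°; the choice of MD5, and of degrees rather than raw hash space, is our own illustration, not something prescribed by the patent:

```python
# Sketch: place a server on the ring by hashing its identifier
# (here, an IP address) to an angle in [0, 360).

import hashlib

def ring_position(identifier: str) -> float:
    digest = hashlib.md5(identifier.encode()).digest()
    value = int.from_bytes(digest[:8], "big")  # 64-bit slice of the hash
    return 360 * value / 2**64                 # scale into [0, 360)

print(round(ring_position("192.168.1.101"), 2))
```

Because the position is derived from the identifier, any machine that knows the identifier can recompute where the server sits on the ring.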
Better still, the imaginary ring allows consistent hashing to abstract the number of servers. Instead of mapping the servers to the ring and the content hashes to the servers (the number of which can change), an even more effective way is to map both the content hashes and the caching server hashes to the ring itself.
Going back to basic high school mathematics, we know a full circle spans from 0° to 360° (equivalently, 2π radians). We then assign each hash to a position at some degree, or portion of a degree, between 0° and 360° on the ring.
A diagram illustrating inserting a new server or document on the ring (aka consistent hashing), adapted from U.S. Patent No. 8458259
By distributing all the hashes in a hash table to a position on the circle, the only calculation that needs to be performed as the number of servers rises or falls is whether or not a given server is suddenly closer to a hash's assigned position on the circle. And since each server's position is based on its own hash rather than manual spacing, the overall index for every single caching server in the cluster would not need to be recalculated.
Now what about collisions? While it's unlikely to generate as many collisions as a non-ring-based hashing—and certainly will require fewer recalculations of the hashing table—a manually implemented ring-hashing approach still has the potential to generate collisions if content placement is arbitrary.
To combat this, the consistent hashing approach of MIT's patents uses a clockwise methodology to map or remap hashes of both content and servers. If the locations of the server and content do not share the exact same hash (i.e., same location on the ring), then the caching solution looks for the next closest server in a clockwise direction and associates the content hash location to that nearby server hash location.
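The clockwise rule described above amounts to a sorted-list lookup: hash the servers and the content onto the same numeric ring, then find the first server position at or after the content's position, wrapping around at the top. A minimal sketch, with hash function, class, and names of our own choosing:

```python
# A minimal consistent-hash ring using the clockwise rule: each
# content hash is served by the first server at or after its
# position on the ring.

import bisect
import hashlib

def h(key: str) -> int:
    """Hash a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

class Ring:
    def __init__(self, servers):
        self.points = sorted((h(s), s) for s in servers)

    def lookup(self, content_key: str) -> str:
        keys = [p for p, _ in self.points]
        i = bisect.bisect_left(keys, h(content_key))
        if i == len(self.points):  # past the "top" of the ring: wrap to 0
            i = 0
        return self.points[i][1]

ring = Ring(["server-a", "server-b", "server-c"])
print(ring.lookup("jfk-library-home-page"))
```

With the server positions kept sorted, each lookup is a binary search rather than a scan of the whole cluster.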
A new server can be added anywhere in the ring, and the only updates to hashing tables that will take place are the ones between the new server and the closest existing server prior to (or counterclockwise from) the new server. Content hashes that are clockwise from the new server will continue to associate with the server they were already assigned to. If we use an example of adding a single server to an existing cluster of seven servers, for a new total of eight servers, the percentage of content-server hash combinations that would need to be updated is in the single digits.
While there's a way to figure out the probability of what content-server hash combinations would need to be updated, we'll just approximate it this way: 100% of combinations divided by seven servers equals a maximum potential of 14.28% of all content-server hash combinations needing to be updated—a far cry from the need to remap 100% in the other examples provided earlier. In all likelihood, an average of half of the 14.28% (or approximately 7%) of content-server hash combinations would need to be updated, which requires minimal resource time and lowers cache-miss errors.
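A rough simulation bears the estimate out: map a batch of content hashes onto a seven-server ring, add an eighth server, and count how many pieces of content change servers. The hash function and server names are illustrative:

```python
# Sketch: fraction of content hashes that move when one server is
# added to a seven-server consistent-hash ring.

import bisect
import hashlib

def h(key: str) -> int:
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

def owner(points, key_hash):
    """Clockwise rule: first server at or after the hash, wrapping."""
    keys = [p for p, _ in points]
    i = bisect.bisect_left(keys, key_hash)
    return points[i % len(points)][1]

servers = [f"server-{i}" for i in range(7)]
before = sorted((h(s), s) for s in servers)
after = sorted(before + [(h("server-7"), "server-7")])

contents = [h(f"content-{i}") for i in range(10_000)]
moved = sum(1 for c in contents if owner(before, c) != owner(after, c))
print(f"{moved / len(contents):.1%} of content hashes remapped")
```

Only the content that falls in the arc claimed by the new server moves; everything clockwise of it stays put.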
Some Servers Are More Equal Than Others
If all servers are of equal CPU, RAM, NIC card, and storage capacity, this approach works well. But what if a new server being added is twice as powerful as the older existing caching servers? In this case, weighting is used, in which a twice-as-powerful server is assigned twice the weight of one of the older servers.
Let's say we have 19 existing servers and add one twice-as-powerful server. While the total number of servers is 20, using weighting, we actually look at both probability and the ring-hashing positions of 21 servers (20 physical, one virtual). So the ideal load would be 4.76% of all content storage and requests for any given server in the 21 physical and virtual servers, rather than 5% for the 20 physical servers. And this increases the probability that the more robust server will store twice as many pieces of content—and get twice as many requests for that content.
This virtualization via weighting plays an even more significant role when it comes to placing caching servers on the ring. These virtual servers can be used to lessen the number of degrees between caching servers.
The beauty of using hashing to place the servers on the ring is that the virtual servers don't need to be placed adjacent to the physical servers along the ring. This means that a single physical server could be spread across the ring as a number of virtual servers, regardless of whether weighting is used or not.
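Virtual servers and weighting come together naturally in code: hash each physical server onto the ring at many points, and give a twice-as-powerful server twice as many points. The vnode count, names, and hash function below are illustrative assumptions:

```python
# Sketch: weighted consistent hashing via virtual servers (vnodes).
# A server with weight 2 gets twice as many points on the ring.

import bisect
import hashlib

def h(key: str) -> int:
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

def build_ring(weights):  # weights: {server_name: integer weight}
    points = []
    for server, weight in weights.items():
        for i in range(weight * 100):  # 100 vnodes per unit of weight
            points.append((h(f"{server}#{i}"), server))
    return sorted(points)

def owner(points, key: str):
    keys = [p for p, _ in points]
    i = bisect.bisect_left(keys, h(key))
    return points[i % len(points)][1]

ring = build_ring({"old-server": 1, "big-server": 2})
hits = {"old-server": 0, "big-server": 0}
for i in range(9_000):
    hits[owner(ring, f"content-{i}")] += 1
print(hits)  # big-server should take roughly twice old-server's share
```

Scattering each server's vnodes around the ring also smooths out load, since no single physical server owns one large contiguous arc.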
Bringing It All Together
The intent of the MIT patent previously mentioned was threefold: prevent swamping, minimize cache memory requirements (not requiring any cache to store a large number of pages), and minimize the delay a browser experiences in obtaining a page. As we've seen in the basic examples, the algorithms behind consistent hashing accomplish all of these. For anyone interested in applying the concepts discussed in this article to a pilot CDN, there are two points worth noting.
First, the patents for consistent hashing, from U.S. Patent No. 8458259 all the way back to the original Patent No. 6430618, filed on March 13, 1998, are now listed as "Expired—Lifetime" on the U.S. Patent and Trademark Office (USPTO) website.
According to the USPTO, the term for a patent begins "on the date on which the patent issues and [ends] 20 years from the date on which the application for the patent was filed in the United States or, if the application contains a specific reference to an earlier filed application or applications … from the date on which the earliest such application was filed."
Design patents are a different matter: those issued from applications filed before May 13, 2015, carry a term of 14 years from the issue date. But the consistent hashing patents are utility patents, and even though U.S. Patent No. 8458259 was granted in 2013, it is a continuation of U.S. Patent No. 6430618, so its term ran from the original 1998 filing date. The patent around consistent hashing has therefore expired.
Second, while it is true that a number of other algorithms, such as cut-through caching developed by Vivek Pai and his team at Princeton and later acquired by Akamai, have enhanced the underlying premise of consistent hashing, it's also worth noting that the "simple" idea presented in 1997 of placing both content hashes and server hashes in an imaginary circle fundamentally changed content delivery.
More importantly, it continues to have a profound effect on the way we'll distribute streaming content in 2020 and beyond. For further insight into the math behind the magic, I'd highly recommend you take the time to read the patent and a few key explanations of the continued advancement of ring hashing at go2sm.com/consistenthashing or go2sm.com/hashingtradeoffs.