This blog post is part of an ongoing series where we will discuss a wide range of H2-related topics.
In today's post, the first of a two-part "series within a series", we will discuss the challenges of stream prioritization and its importance in achieving HTTP/2's performance promise.
HTTP/2 (also known simply as H2) captures the lessons learned from what has been lacking in HTTP/1.1 and brings much-needed innovation to the core protocol that drives the web. In this first post we will talk about the importance of stream prioritization in HTTP/2. In the second post, we will talk about how Akamai's proxy handles stream dependency priorities to ensure the best possible performance.
What is Stream Priority?
HTTP/2 allows for multiplexing multiple streams (an HTTP/2 stream is roughly equivalent to a request in HTTP/1) on a single TCP connection. According to RFC 7540, an HTTP/2 client SHOULD open only a single TCP connection per domain. This is a step away from HTTP/1 usage, where clients opened six to eight TCP connections to improve performance. A number of novel concepts were introduced in the H2 specification to make this transition from multiple TCP connections to a single TCP connection possible without loss of performance. One such concept is the ability to send "hints" indicating the importance of a given stream (its priority) relative to other streams on the same connection, so that resources can be allocated appropriately. In addition to a traditional numerical priority weight, an H2 stream can also express dependencies on other streams.
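To make the hint concrete, here is a minimal sketch of the priority fields an HTTP/2 PRIORITY frame carries per RFC 7540 Section 5.3: a dependency on a parent stream, a weight between 1 and 256, and an exclusive flag. The `PriorityHint` class and the example stream IDs are illustrative, not part of any particular library.

```python
from dataclasses import dataclass

@dataclass
class PriorityHint:
    """Priority fields of an HTTP/2 PRIORITY frame (RFC 7540, Section 5.3)."""
    stream_id: int           # the stream being (re)prioritized
    depends_on: int = 0      # parent stream ID; 0 means the root of the tree
    weight: int = 16         # 1..256 (the wire format encodes weight - 1 in one byte)
    exclusive: bool = False  # if True, this stream becomes the sole child of its parent

# A client might hint that a stylesheet matters more than an image:
css = PriorityHint(stream_id=3, depends_on=0, weight=256)
img = PriorityHint(stream_id=5, depends_on=0, weight=32)
```

The default weight of 16 matches the default RFC 7540 assigns to streams that send no explicit priority.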
A restaurant analogy
Let's use the example of a restaurant taking an order to understand the importance of assigning correct weights and dependencies. In an HTTP/1 restaurant, the kitchen would simply prepare meals in the sequence in which the waiter ordered them. In an HTTP/2 restaurant, the waiter can send "hints" to the kitchen about the sequence in which a table should receive its dishes, and therefore the sequence in which they should be prepared (for example, appetizers before salads, and salads before the main courses).
Let us take this restaurant analogy and mock up an order request. The arrows show request dependencies and the numbers in parentheses show the weights.
The above example introduces the concept of grouping: soup, salad, appetizer, and entrée (streams) are not actual physical items being requested, but merely grouping constructs to help with prioritization. HTTP/2 also allows for such grouping constructs and the ability to assign a weight to each one. The idea is to be able to quickly change the priority of a whole collection of streams.
One way to calculate priority is to assume an imaginary pool of limited resource units; for our example, let us use 1,000 units. Appetizers have a weight of 40 out of 100 (30 + 30 + 40) for that table, which gives them 40% of the 1,000 units, or 400. Entrées depend on the appetizer and get a 30% share of those 400 units, which comes out to 120 resource units. Pasta and steak each get half of that, or 60 units each.
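This allocation can be sketched as a recursive walk of the dependency tree, where each node takes its weight's share of its parent's allocation. The tree below is an assumption reconstructed to reproduce the numbers above (in particular, the 70-weight "other courses" sibling under the appetizer is invented so that the entrée group receives its stated 30% share); the figure in the original post defines the actual weights.

```python
def allocate(units, children):
    """Split `units` among (name, weight, grandchildren) tuples by weight,
    then recursively split each child's share among its own children."""
    total = sum(weight for _, weight, _ in children)
    shares = {}
    for name, weight, kids in children:
        share = units * weight / total
        shares[name] = share
        shares.update(allocate(share, kids))
    return shares

# Weights adapted from the restaurant example (tree shape is an assumption):
menu = [
    ("soup", 30, []),
    ("salad", 30, []),
    ("appetizer", 40, [
        ("entree", 30, [("pasta", 50, []), ("steak", 50, [])]),
        ("other courses", 70, []),   # hypothetical sibling, see lead-in
    ]),
]
shares = allocate(1000, menu)
# appetizer -> 400.0, entree -> 120.0, pasta -> 60.0, steak -> 60.0
```

Note that real HTTP/2 servers only divide a parent's allocation among children like this in the simplified model; per RFC 7540, dependent streams are normally only given resources when their parent is idle or closed.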
And here is where the restaurant analogy breaks down. The HTTP/2 specification says to use the weights and dependencies to allocate resources for requests. This does not mean a request is blocked until all higher-priority requests are done. Furthermore, these hints are just that: hints. The server can choose to ignore them. If an object is available and can be served, there is no reason it should wait.
Let us take a look at how browsers give out priority hints. Chrome 49.0 does not send a dependency tree; it opted to send only weight information. Firefox is one of the few web browsers that implemented priority with a dependency tree. The figure below shows the prioritization hints that Firefox 44.0.2 sent to a web site.
The circular nodes show grouping constructs similar to those in our restaurant example. These constructs, also known as anchor streams, do not have a request associated with them and are created at the start of each connection. The actual requests start at stream ID 13 for the base page, which then spawns the rest of the requests, stream IDs 15 to 141.
The concept itself is relatively straightforward on paper. But like most things, the implementation is not so easy. From the server's perspective, we were faced with the following questions:
- What resource do we want to restrict based on prioritization?
- How much change can we effect based on those hints?
- And how would we measure the patron's satisfaction?
Decisions had to be made early just so we could start somewhere. For simplicity, we decided to order the system write call based on the priority of data frames. At any point in time we have a set of data frames available to be sent out. We sort this set by priority before writing so that the higher-priority data frames go out first.
As for measurement, we knew that the improvement had to be measured by something other than total time served: if things were done right, the total serving time should be the same no matter what sequence the objects were served in. We knew it should be something on the client (browser) end because, after all, the client sends the hints in order to optimize some property on its end. The browser's onLoad time became our measure of user satisfaction.
In our next post, we will talk about the engineering work that Akamai has invested to handle stream dependencies to ensure the best possible performance.
To learn more, go to: https://http2.akamai.com/
Additional reading on HTTP/2: