A comprehensive guide to HTTP/2

Almost every modern business with a digital presence uses this 20-year-old protocol. It could be a website, a mobile application, or a standalone application that needs to be updated over the internet; HTTP is ubiquitous. Hence, most software engineers have to deal with HTTP to do their jobs, but only a few of them truly understand this remarkable medium of communication.

There are two main reasons for this. Firstly, like any other protocol, HTTP is wrapped in high-level abstractions by modern libraries by the time it reaches the application engineer.
Secondly, people only start to care about it when things go wrong, during a production incident that urgently needs fixing. At that point, no one has time to learn the essentials of the protocol.
I have seen experienced engineers who do not understand the basic concepts of the protocol and expect the language or framework library to take care of everything for them.

HTTP 1.1

Most applications still use HTTP 1.1, which was brought to life to address the shortcomings of HTTP 1.0. HTTP 1.1 was originally defined in RFC 2068 in January 1997 and has been revised a few times since; it was properly specified in 1999 by RFC 2616. It was not designed for the way we use it today, but it brought some worthy features that have allowed it to survive for 20 years, even in the face of the fast-paced evolution of internet-based applications.

  • The Connection keep-alive header allows a connection to be reused for multiple requests, saving the cost of reopening the connection numerous times.
  • Content negotiation allows the client and server to agree on encoding, language, and content type.
  • Chunked responses enable the transfer of large documents.
  • Additional cache-controlling mechanisms.
  • The addition of the “Host” header allows different domains to be served from the same IP address.
HTTP 1.1 request and response (Image Source : https://developer.mozilla.org/en-US/docs/Web/HTTP/Messages)
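To make the first two bullets concrete, here is a rough sketch of what two sequential HTTP 1.1 requests on one persistent connection look like on the wire. `example.org` is a placeholder host and `build_request` is purely illustrative, not a library API:

```python
# Sketch: two HTTP 1.1 requests serialised for one persistent connection.
def build_request(path: str, host: str) -> bytes:
    """Serialise a GET request that keeps the TCP connection open."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",           # lets one IP address serve many domains
        "Connection: keep-alive",  # reuse the connection for the next request
        "Accept-Language: en",     # content negotiation
        "", "",                    # blank line terminates the header block
    ]
    return "\r\n".join(lines).encode("ascii")

# Both requests travel over the same TCP connection, one after the other.
wire = build_request("/", "example.org") + build_request("/about", "example.org")
print(wire.decode("ascii"))
```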

Limitations of HTTP 1.1 (in the context of the modern web)

Underutilisation of TCP capabilities.
HTTP 1.1 does not take full advantage of TCP's congestion control. The Transmission Control Protocol has a mechanism called “congestion control” to determine the optimal size of the congestion window. It usually starts with 4–10 packets and increases based on the acknowledgements received from the other side. This “slow start” allows TCP to optimise the data transfer rate according to network conditions. However, because of HTTP 1.1's synchronous behaviour, browsers tend to open more than one connection to fetch the content of rich web applications.
Hence the optimisation at the TCP layer happens several times concurrently, and some connections may reach optimal settings while others may not.
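The warm-up cost can be illustrated with a toy model of slow start. This is not a real TCP implementation; the doubling-per-round-trip growth and the initial window of 10 packets are simplifying assumptions:

```python
# Toy model: the congestion window starts small and roughly doubles each
# round trip during slow start, so every *new* connection a browser opens
# pays this warm-up cost again.
def slow_start(initial_cwnd: int = 10, rounds: int = 5) -> list:
    """Congestion window (in packets) at each RTT, ignoring packet loss."""
    cwnd, history = initial_cwnd, []
    for _ in range(rounds):
        history.append(cwnd)
        cwnd *= 2  # each acknowledged packet grows the window
    return history

print(slow_start())  # [10, 20, 40, 80, 160]
```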

TCP Connections per page with time: https://httparchive.org/reports/state-of-the-web

Not being able to compress headers
Most modern web applications use HTTP headers extensively to meet the complex demands of the client application, and the size of the headers rises with the increasing complexity of the content.

JSON Web Tokens (JWT), OAuth 2.0, and its latest extension, OpenID Connect, all depend on HTTP headers. Hence, more information flows through HTTP headers than ever before, and the lack of a mechanism to compress headers becomes more and more apparent.

No request prioritisation mechanism
When you load a web page, its content comes from different APIs, and the browser makes several distinct calls to the back end to fetch it. However, the importance of each call for the overall user experience may not be equal. A bank customer may want to see the balance straight away but may not care about notifications with the same urgency. Due to HTTP 1.1's lack of prioritisation, there is no way for the front-end developer to say that the balance call is more important than the notification call.

The limitations above are highlighted more often than others, but it is worthwhile to spend some time understanding the other limitations of HTTP 1.1 as well.


HTTP 2 introduces a new set of abstractions to address the limitations of HTTP 1.1. Before we dive deep into protocol internals, it is vital to understand the basic building blocks of the protocol.

Binary frame: the smallest unit of communication in HTTP 2.

Structure of HTTP 2 binary frame (Image source https://httpwg.org/specs/rfc7540.html)
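Every frame starts with the same fixed 9-byte header: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a reserved bit followed by a 31-bit stream identifier. A minimal sketch of decoding it (`parse_frame_header` is illustrative, not a library API):

```python
# Sketch: decoding the 9-byte HTTP/2 frame header defined in RFC 7540.
def parse_frame_header(data: bytes):
    """Return (length, type, flags, stream_id) from a 9-byte frame header."""
    if len(data) < 9:
        raise ValueError("frame header is 9 bytes")
    length = int.from_bytes(data[0:3], "big")   # 24-bit payload length
    frame_type, flags = data[3], data[4]        # one byte each
    # mask off the reserved high bit to get the 31-bit stream identifier
    stream_id = int.from_bytes(data[5:9], "big") & 0x7FFFFFFF
    return length, frame_type, flags, stream_id

# A DATA frame (type 0x0) of 16 bytes on stream 1, END_STREAM flag (0x1) set:
header = (16).to_bytes(3, "big") + bytes([0x0, 0x1]) + (1).to_bytes(4, "big")
print(parse_frame_header(header))  # (16, 0, 1, 1)
```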

Stream: a bidirectional sequence of frames exchanged between client and server over an HTTP connection.

HTTP 2 Stream(s) transmitting sequence of frames within a single TCP connection (Image Source https://developers.google.com/web/fundamentals/performance/http2)

Features of HTTP 2.0

Request multiplexing via frames and streams

Together, frames and streams make it possible to reuse an already established connection for multiple request-response exchanges. When a client wants to call six different endpoints, it simply creates six different streams. Each frame carries a stream identifier (a 31-bit integer), so determining which “frame” belongs to which “stream” is not a problem.

There are ten frame types in HTTP 2, as follows.

HTTP 2 frame types (Source: https://tools.ietf.org/html/rfc7540 )
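For reference, the ten frame types and their on-wire type codes from RFC 7540 can be written as a small enum (a sketch mirroring the table in the spec):

```python
from enum import IntEnum

# The ten HTTP/2 frame types from RFC 7540, section 6.
# The values are the one-byte type codes carried in the frame header.
class FrameType(IntEnum):
    DATA = 0x0
    HEADERS = 0x1
    PRIORITY = 0x2
    RST_STREAM = 0x3
    SETTINGS = 0x4
    PUSH_PROMISE = 0x5
    PING = 0x6
    GOAWAY = 0x7
    WINDOW_UPDATE = 0x8
    CONTINUATION = 0x9

print(FrameType(0x1).name)  # HEADERS
```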

Note that there are specific frame types for DATA and HEADERS.
Hence, where we saw headers and message bodies in HTTP 1.1, in HTTP 2 we see HEADERS frames and DATA frames.

Streams in action HTTP 2 (Source: https://developers.google.com/web/fundamentals/performance/http2)

Flow control via frames.

In HTTP 1.1, the server dominates the data flow. The client has no say in how much data it wants, despite being the one that requested the data in the first place. Hence the client can end up being choked by an overflow of data.

HTTP 2 solves this by introducing a specific frame type called WINDOW_UPDATE. This frame indicates to the server that the client is ready to consume more bytes.
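Combining the frame-header layout with this, a WINDOW_UPDATE frame (type 0x8) carrying a 31-bit window increment can be built like so (`window_update` is an illustrative helper, not a library function):

```python
# Sketch: building a WINDOW_UPDATE frame (RFC 7540, type 0x8). Its 4-byte
# payload carries a 31-bit increment: how many more bytes the peer may send.
def window_update(stream_id: int, increment: int) -> bytes:
    if not 0 < increment <= 0x7FFFFFFF:
        raise ValueError("increment must be a positive 31-bit integer")
    payload = increment.to_bytes(4, "big")
    header = (len(payload)).to_bytes(3, "big")            # 24-bit length
    header += bytes([0x8, 0x0])                           # type, flags
    header += (stream_id & 0x7FFFFFFF).to_bytes(4, "big")  # stream id
    return header + payload

# Grant stream 1 another 65,535 bytes of flow-control window:
frame = window_update(stream_id=1, increment=65535)
print(frame.hex())
```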

Request prioritisation via stream dependency.

HTTP 2 uses a specialised frame type called PRIORITY. The PRIORITY frame looks like the figure below.

PRIORITY Frame (Source: https://tools.ietf.org/html/rfc7540#section-6.3)

It consists of two required fields, “Stream Dependency” and “Weight.” The “Stream Dependency” field communicates the stream that this stream depends on, and the “Weight” field indicates the relative priority of the stream.

Using these fields, a client can communicate that request A is more critical than request B.
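A sketch of the 5-byte PRIORITY payload follows; note that the weight is stored on the wire as weight − 1, giving an effective range of 1–256 (`priority_payload` is illustrative, not a library function):

```python
# Sketch: the 5-byte PRIORITY frame payload from RFC 7540, section 6.3 —
# an exclusive bit, a 31-bit stream dependency, and a one-byte weight.
def priority_payload(dependency: int, weight: int, exclusive: bool = False) -> bytes:
    if not 1 <= weight <= 256:
        raise ValueError("weight must be 1-256")
    dep = (dependency & 0x7FFFFFFF) | (0x80000000 if exclusive else 0)
    return dep.to_bytes(4, "big") + bytes([weight - 1])  # wire value is weight-1

# This stream depends on stream 3 with weight 200:
print(priority_payload(3, 200).hex())  # 00000003c7
```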

Server-pushed messages.

Server push allows the server to send an object to the client before it is requested. The server does this via another special frame type called PUSH_PROMISE.

This feature plays a vital part in improving website loading times. The server can act proactively, sending a large object that it knows the client will need in the near future.

There are a few rules a server needs to follow to execute server push securely and efficiently.

  • The stream ID in the PUSH_PROMISE frame header must always be associated with a request the client has already made, which stops servers from sending arbitrary messages to clients.
  • The server-pushed content should be cacheable.
  • The server can only push requests whose methods are safe (such as GET).
PUSH_PROMISE Frame (Source: https://tools.ietf.org/html/rfc7540#section-6.6)

Header compression through HPACK

Communicating headers between client and server consumes a large chunk of bandwidth. Modern web applications use headers ranging from 200 bytes to over 2 KB, and make hundreds of requests to load a single web page.

In addition, modern security frameworks/protocols like OAuth 2.0 and its latest extension, OpenID Connect, use base64-encoded (signed/encrypted) JSON payloads for authentication and authorisation purposes. These JSON Web Tokens (JWTs) sometimes grow beyond 4 KB. In HTTP 1.1, the only way to accommodate them is to increase the hard header-size limit in the application server and apply the same change to other layers such as proxies and API gateways.

HTTP 2 brings HPACK

Long ago, HTTP compressed both headers and body with the DEFLATE algorithm at the TLS layer, and SPDY introduced a header compression mechanism as well. But both of these mechanisms were vulnerable to the CRIME attack.

A CRIME attacker exploits the fact that content compression happens before encryption. The attacker mixes his own data with the actual content at the compression stage and observes the size of the encrypted output:

(encrypted data = encrypt(compress(attacker's content + real data)))

By watching how the compressed size changes, he can gradually deduce the unknown data from the encrypted data set.

HPACK was designed with CRIME in mind; hence, it is resilient against it. HPACK uses the following methods for compression.

Static table: a predefined list of 61 commonly used header fields, some with predefined values.

Dynamic table: as the name implies, this table is dynamic and works as a first-in-first-out queue. It contains the actual headers seen during the connection and can contain duplicate entries.

Huffman encoding: a set of Huffman codes used to encode any string key or value.

When compression happens, HPACK follows the following steps.

  1. Look the key-value pair up in the static and dynamic tables; if a full match is found, it is referenced by its table index.
  2. If not, it tries to find an entry with a matching key; the key's index is then reused and only the value is sent.
  3. If neither of the above works, the key and value are encoded literally using Huffman encoding.
HPACK in action in HTTP 2 (Source: https://developers.google.com/web/fundamentals/performance/http2)
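The lookup order can be sketched as follows. The `STATIC_TABLE` entries shown are a small subset of the real 61-entry table from RFC 7541, and `encode_header` is a simplified illustration, not a real HPACK encoder (it omits Huffman coding and table-size eviction):

```python
# A handful of real entries from the HPACK static table (RFC 7541, Appendix A),
# mapped to their table indices. The full table has 61 entries.
STATIC_TABLE = {
    (":method", "GET"): 2,
    (":method", "POST"): 3,
    (":path", "/"): 4,
    (":status", "200"): 8,
}

def encode_header(name, value, dynamic_table):
    """Toy HPACK lookup: static table, then dynamic table, then literal."""
    entry = (name, value)
    if entry in STATIC_TABLE:                      # step 1: full static match
        return ("indexed-static", STATIC_TABLE[entry])
    if entry in dynamic_table:                     # step 1: full dynamic match
        return ("indexed-dynamic", dynamic_table.index(entry))
    # steps 2-3: no full match — emit a literal and remember it for next time
    dynamic_table.append(entry)
    return ("literal", name, value)  # a real encoder would Huffman-code this

dynamic = []
print(encode_header(":method", "GET", dynamic))     # ('indexed-static', 2)
print(encode_header("x-trace-id", "abc", dynamic))  # ('literal', 'x-trace-id', 'abc')
print(encode_header("x-trace-id", "abc", dynamic))  # ('indexed-dynamic', 0)
```

Note how the second occurrence of the custom header is sent as a short index instead of the full string; this is where HPACK's savings come from on repeated requests.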

This article discussed the features of HTTP 1.1 and the drawbacks it poses in the face of the modern web, along with the capabilities and implementation details of HTTP 2.
Knowing the internals of the protocols you rely on gives you significant advantages when designing and developing a distributed system.





Senior Software Engineer | Cloud | API | System Design