
An Introduction to Shared Compression Dictionaries

Anna Monus · 23 min read

Shared compression dictionaries introduce a new way to speed up your website by reducing the amount of data that needs to be transferred for repeated visits.

In this article, we'll take a look at how this technology works and how you can start using it on your website.

What Are Shared Compression Dictionaries?

When a website loads files over the HTTP protocol, these files are typically compressed to reduce the amount of data that's transferred. That in turn means that less time is spent downloading the file.

For repeated visits, files that have been downloaded previously can be served from the browser cache. However, developers continuously make changes to these files, and they need to be downloaded again, even though the files are very similar.

Shared compression dictionaries solve this problem. Even if a file has been updated, some previously downloaded data can be re-used, thus reducing the amount of data that needs to be transferred on a subsequent visit.

info

Shared compression dictionaries reduce page weight and page load time after the initial visit.

What Are Compression Dictionaries?

Compression dictionaries contain a list of terms or phrases that are frequently used in a file.

By default, text compression replaces common terms with shorter tokens, preventing the repetition of each instance. The dictionary speeds up text compression by including a list of these terms, so the encoder doesn't have to find them independently.

With newer HTTP file compression algorithms, the server can use the dictionary to compress resources and the browser can use it to decompress them more effectively.

Sharing Compression Dictionaries Between Files

A shared compression dictionary is a separate file that consists of a list of text strings (e.g. HTML markup, CSS declarations, scripts, etc.) that appear in either the different versions of the same resource (e.g. the subsequent versions of a JavaScript library) or similar resources hosted on the same domain (e.g. single product pages or blog posts).

Shared compression dictionaries extend two popular HTTP compression algorithms: Brotli and Zstandard. However, you can't use them together with GZIP.

The web performance benefits of shared compression dictionaries are impressive — in some cases, they can reduce the download size of a text or binary file by more than 90%.

tip

Shared compression dictionaries will be released as part of Chrome 130 in October 2024.

A Technical Look at How Compression Dictionaries Work

Shared compression dictionaries are defined by an Internet Engineering Task Force (IETF) draft called Compression Dictionary Transport as:

"...a mechanism for dictionary-based compression in the Hypertext Transfer Protocol (HTTP) ... [that] extends existing HTTP compression methods..."

A shared compression dictionary is a standalone file that reduces the download size of compressed resources by providing the encoder with a pre-made list of common strings shared across various files on the same domain. The encoder uses this dictionary to tokenize text more efficiently than it could without it (below, we'll see how this works in detail).

You can use shared compression dictionaries with either the Brotli or the Zstandard compression algorithm in the following way:

  • Brotli supports shared compression dictionaries from version 1.1.0 onward. The algorithm also includes a built-in static dictionary that is always used. If you decide to add a shared compression dictionary, the external dictionary will be used along with Brotli's built-in dictionary.

  • Zstandard doesn't have a built-in dictionary, but it has supported shared compression dictionaries since its initial release. You can generate custom dictionaries by running its encoder in training mode with your sample data.

Adding a shared compression dictionary to your compression operations leads to significant file size reductions in many cases (however, not always) — below, we'll see how.

warning

At the time of writing, the Compression Dictionary Transport IETF draft is still in progress and subject to change.

Text Compression With and Without a Shared Compression Dictionary: What's the Difference?

The two diagrams below are from the "Compression Dictionaries" talk of Pat Meenan, co-author of the Compression Dictionary Transport specification, that he presented at the performance.now() 2023 conference.

In the first diagram, you can see how Brotli compression works without a shared compression dictionary:

HTTP text compression without shared compression dictionary, diagram

The second diagram shows how Brotli works when it's used along with a shared compression dictionary:

HTTP text compression with shared compression dictionary, diagram

As you can see above, both techniques work by tokenizing common strings and then referencing the tokens (i.e. *, &, and $).

However, when using standalone compression, the encoder first needs to identify the patterns itself, whereas the shared compression dictionary provides a pre-defined list of the common strings, so in most cases, the compression process will take less time.
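
The effect is easy to reproduce in miniature. The sketch below is only an illustration under stated assumptions: it uses the preset-dictionary support of Python's built-in zlib module (DEFLATE) as a stand-in, because Brotli and Zstandard bindings are not part of the standard library, and the sample CSS and the function names are made up for the example.

```python
import zlib
from typing import Optional


def compress(data: bytes, dictionary: Optional[bytes] = None) -> bytes:
    """Compress data with DEFLATE, optionally primed with a preset dictionary."""
    if dictionary is None:
        encoder = zlib.compressobj(level=9)
    else:
        encoder = zlib.compressobj(level=9, zdict=dictionary)
    return encoder.compress(data) + encoder.flush()


def decompress(blob: bytes, dictionary: Optional[bytes] = None) -> bytes:
    """Decompress; the decoder needs exactly the same dictionary bytes."""
    if dictionary is None:
        decoder = zlib.decompressobj()
    else:
        decoder = zlib.decompressobj(zdict=dictionary)
    return decoder.decompress(blob) + decoder.flush()


# Hypothetical "old release": 200 small CSS rules with varying values.
old_css = "".join(
    f".btn-{i}{{color:#{(i * 7919) % 0xFFFFFF:06x};padding:{i % 13}px}}"
    for i in range(200)
).encode()

# Hypothetical "new release": one rule changed, one rule added.
new_css = old_css.replace(b"padding:5px", b"padding:6px", 1) + b".btn-new{margin:0}"

plain = compress(new_css)
with_dict = compress(new_css, dictionary=old_css)

# Sizes in bytes: uncompressed, compressed alone, compressed with the dictionary.
print(len(new_css), len(plain), len(with_dict))

# The round trip only works when the decoder holds the same dictionary.
assert decompress(with_dict, dictionary=old_css) == new_css
```

Because almost every string in the new file already exists in the dictionary, the encoder can replace most of the payload with short back-references. DEFLATE only looks back 32 KB, so this stand-in would not scale to large files the way Brotli and Zstandard dictionaries do.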

File Formats for Shared Compression Dictionaries

Compression Dictionary Transport is a truly flexible protocol.

You can use any file type that can be sent over HTTP, such as HTML, CSS, JavaScript, or WebAssembly, as a shared compression dictionary. Or, if you want to use a dictionary that includes strings from more than one file type, you can also create a simple .dat or .txt file.

Types of Shared Compression Dictionaries

As I mentioned above, shared compression dictionaries can be used in two ways:

  • Static compression (also known as delta compression) is used for files that only change with new releases, such as CSS and JavaScript files. A static compression dictionary is an earlier version of the resource, marked as a dictionary in the response header during a previous session. The browser caches this file as a potential dictionary so it can later use it to decompress subsequent versions of the resource that have been compressed on the server using the same file as a dictionary. Delta compression can be pretty effective because the two subsequent releases will share many common parts.
  • Dynamic compression is used for dynamically generated files that frequently change or display different content to different users, such as an HTML blog home page or an eCommerce product page. For dynamic dictionaries, you need to create a dedicated dictionary file that includes all the common strings that files matching a pre-defined URL pattern may contain, and then send it to the browser during idle time.

We'll look into how to implement both techniques below, but first let's see how shared compression dictionaries impact web performance.

Shared Compression Dictionaries and Web Performance

Shared compression dictionaries improve page load times by reducing the download size of resources transmitted over the network, allowing them to reach the user's browser faster.

Above a certain compression level, they'll improve any web performance metric that depends on page load time, including the Core Web Vitals. The compression levels for which it's worth using a shared compression dictionary are as follows:

  • Brotli:
    • It has 12 compression levels (between 0 and 11).
    • Shared compression dictionaries improve the compression ratio from Brotli 5 upward (i.e. Br 5 - 11).
    • Below Brotli 5, the difference is zero to minimal.
  • Zstandard (Zstd):
    • It has both negative and positive compression levels (between -7 and 22).
    • Shared compression dictionaries improve the compression ratio at any positive compression level (i.e. Zstd 1 - 22), but the difference is only significant from Zstd 3 upward.

info

A lower compression level means a faster compression operation on the server, but the compressed file will be larger in size. A higher compression level generates smaller files, but it results in higher CPU time on the server.

A Real-World Example

Now let's see an example of how shared compression dictionaries impact file size across the different compression levels of Brotli and Zstandard.

I used the delta compression tester on the Use As Dictionary test site (also created by Pat Meenan) to compress the latest version of Bootstrap's minified CSS file (v5.3.3) using an earlier version of the same file (v5.2.3) as a shared compression dictionary.

The delta compression tester returned the following data (for reference, the comparison table below also shows the download size when the file was compressed with GZIP 9, which is the highest compression level for GZIP):

Delta compression at different compression levels with and without a shared compression dictionary, with Brotli and Zstandard

tip

In this example, shared compression dictionaries reduced download size by up to 76%!

As you can see above, using an earlier version of the minified CSS file (i.e. bootstrap.min.css?v=5.2.3) as a shared compression dictionary has reduced the download size of its updated version (i.e. bootstrap.min.css?v=5.3.3) at Brotli 5 and above and at Zstandard 1 and above.

Here's the above chart in a table format, too:

| Compression level | Download size without shared compression dictionary (bytes) | Download size with shared compression dictionary (bytes) | Download size ratio |
| --- | --- | --- | --- |
| GZIP 9 | 30,767 | n/a | n/a |
| Brotli 1 | 39,929 | 39,929 | 100.00% |
| Brotli 3 | 34,923 | 34,923 | 100.00% |
| Brotli 5 | 28,250 | 7,217 | 25.55% |
| Brotli 7 | 26,953 | 7,049 | 26.15% |
| Brotli 11 | 22,709 | 6,111 | 26.91% |
| Zstd 1 | 35,642 | 24,761 | 69.47% |
| Zstd 2 | 34,249 | 21,797 | 63.64% |
| Zstd 3 | 34,321 | 12,929 | 37.67% |
| Zstd 5 | 29,982 | 9,410 | 31.39% |
| Zstd 7 | 29,979 | 9,304 | 31.04% |
| Zstd 10 | 28,303 | 7,917 | 27.97% |
| Zstd 19 | 26,035 | 6,118 | 23.50% |
| Zstd 22 | 26,037 | 6,094 | 23.41% |

Note that the Download size ratio column above is the ratio of the compressed size of the bootstrap.min.css?v=5.3.3 file with and without using the bootstrap.min.css?v=5.2.3 file as a shared compression dictionary, with the same compression algorithm, at the same compression level.

If you wanted to calculate the total compression ratio, you would need to divide the respective download size by the size of the uncompressed bootstrap.min.css?v=5.3.3 file, which is 232,803 bytes.
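
To make the two ratios concrete, here is a small worked calculation. The byte counts are copied from the table and the paragraph above; the function and variable names are only illustrative.

```python
UNCOMPRESSED_SIZE = 232_803  # bytes, bootstrap.min.css v5.3.3 (figure quoted above)

# (size without dictionary, size with dictionary) in bytes, copied from the table
SIZES = {
    "Brotli 5": (28_250, 7_217),
    "Brotli 11": (22_709, 6_111),
    "Zstd 3": (34_321, 12_929),
    "Zstd 22": (26_037, 6_094),
}


def download_size_ratio(without_dict: int, with_dict: int) -> float:
    """Size with the dictionary as a percentage of the size without it."""
    return with_dict / without_dict * 100


def total_compression_ratio(compressed: int, uncompressed: int = UNCOMPRESSED_SIZE) -> float:
    """Compressed size as a percentage of the uncompressed file."""
    return compressed / uncompressed * 100


for level, (without_dict, with_dict) in SIZES.items():
    print(
        f"{level}: download size ratio {download_size_ratio(without_dict, with_dict):.2f}%, "
        f"total without dictionary {total_compression_ratio(without_dict):.2f}%, "
        f"total with dictionary {total_compression_ratio(with_dict):.2f}%"
    )
```

The first figure reproduces the Download size ratio column, while the other two relate each download to the 232,803-byte original.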

tip

Bootstrap releases new versions fairly infrequently. If you release new versions of your CSS or JavaScript files more frequently (e.g. daily), the web performance benefits of delta compression with a shared compression dictionary will be even more significant.

Background: Compression versus Minification

Minification and compression are not the same thing.

Minification removes unnecessary characters, such as whitespace, comments, and redundant semicolons, so you can upload the text file to your server in its most compact form while it remains valid plain-text source code (e.g. bootstrap.min.css). You can minify your text files using a tool such as minifier.org or a module bundler such as webpack or Rollup.

Compression is performed on the server using an algorithm such as GZIP, Brotli, or Zstandard. The encoder uses one of these algorithms to find and tokenize common strings, and then convert the tokenized text to a binary format. For example, when Bootstrap's minified CSS file is compressed, it's sent to the browser in a binary format such as bootstrap.min.css.gz, bootstrap.min.css.br, or bootstrap.min.css.zst that the browser decodes back to bootstrap.min.css.

info

Compression reduces file size to a much greater extent than minification.

Static Dictionary Compression (a.k.a. Delta Compression)

Now, let's see what static compression looks like in code.

The biggest web performance advantage of delta compression is that you don't need to create and send a dedicated dictionary file, as you do with dynamic compression. Instead, you mark a file that you would send to the browser anyway as a shared compression dictionary.

For the new release of the resource, you'll just need to send the dictionary-compressed version of the changes (a.k.a. the delta).

1. The Server's Initial HTTP Response Header (with Use-As-Dictionary)

To mark a resource (say, the minified CSS file of Bootstrap v5.2.3) as a shared compression dictionary, you'll need to add the Use-As-Dictionary header to the HTTP response:

HTTP/2 200
Date: Tue, 01 Oct 2024 16:29:05 GMT
Content-Type: text/css
Content-Length: 23707
Server: Apache/2.4.41 (Ubuntu)
Use-As-Dictionary: match="/bootstrap@5.*/dist/css/bootstrap.min.css"
Content-Encoding: br
Cache-Control: max-age=2592000
Vary: Accept-Encoding

This is an ordinary HTTP response header that sends the Brotli-compressed CSS file to the browser (if your file is compressed with Zstandard, you'll need to set the Content-Encoding header to zstd).

The Use-As-Dictionary header allows you to inform the browser that it can use this file as a compression dictionary in the future for files that match the URL pattern specified by its match parameter.

The match parameter is a required parameter of Use-As-Dictionary. It accepts the URLPattern syntax, including the * wildcard, so you can specify flexible routes. You can define only one path in match, as each shared compression dictionary can only apply to one URL pattern.

The Vary header prevents caches from serving this Brotli-compressed version of Bootstrap's minified CSS file to clients that don't support Brotli. We'll discuss below how Vary works with shared compression dictionaries in more detail.

info

For privacy and security reasons (see below), you need to serve your shared compression dictionaries from your origin domain.

Or, if you want to serve them from a third-party CDN, you'll need to add an Access-Control-Allow-Origin header to the HTTP response.

2. The Browser's Subsequent Request Header (with Available-Dictionary)

Now, the browser has stored the bootstrap.min.css?v=5.2.3 file in cache as a potential shared compression dictionary for your origin domain (due to cache partitioning, it won't be available from other domains).

A few days or weeks later, when the user revisits your site and the browser receives the HTML page, it recognizes that now it uses a new version of the Bootstrap library (i.e. it needs to request bootstrap.min.css?v=5.3.3 from your server instead of loading bootstrap.min.css?v=5.2.3 from cache). However, the browser also recognizes it can use bootstrap.min.css?v=5.2.3 as a shared compression dictionary for the new release.

So, it includes that information in the HTTP request for the CSS file using the Available-Dictionary header:

GET /bootstrap@5.3.3/dist/css/bootstrap.min.css HTTP/2
Host: www.your-site.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/130.0.0.0 Safari/537.36
Accept: text/css,*/*;q=0.1
Accept-Encoding: gzip, br, zstd, dcb
Available-Dictionary: sha256-c916a28ff1803a3b4e5b19dfa65b3c39afefe27451c49563ce7babe516a47adf

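The Available-Dictionary header specifies the SHA-256 hash of the dictionary's contents, so your server can identify exactly which dictionary the browser holds and confirm that it has the same file available for compression.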
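
As a rough sketch, a server or build script could compute the hash of a stored dictionary as follows. The function names are hypothetical. The first helper returns the hexadecimal form shown in the request above; the second assumes the serialization described in the IETF draft, where the hash is sent as a structured-field byte sequence (base64 between colons).

```python
import base64
import hashlib


def dictionary_hash_hex(dictionary: bytes) -> str:
    """SHA-256 of the dictionary contents as a hexadecimal string."""
    return hashlib.sha256(dictionary).hexdigest()


def available_dictionary_value(dictionary: bytes) -> str:
    """SHA-256 of the dictionary contents as a structured-field byte sequence.

    Assumption: base64 of the raw digest wrapped in colons, as in the IETF draft.
    """
    digest = hashlib.sha256(dictionary).digest()
    return ":" + base64.b64encode(digest).decode("ascii") + ":"


# Stand-in bytes; in practice this would be the uncompressed dictionary file.
cached_dictionary = b".btn{color:red}"

print(dictionary_hash_hex(cached_dictionary))
print(available_dictionary_value(cached_dictionary))
```

Either way, the hash is calculated over the uncompressed bytes of the dictionary, so the server can precompute it once per release and use it as a lookup key.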

Since the browser can only propose one shared compression dictionary for a request, if it has more than one dictionary in cache that matches the URL pattern, it will pick the best match (which will be the longest match string, as most likely that will be the most specific resource).
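
The selection rule described above can be sketched in a few lines. This is a simplification and not how any browser implements it: fnmatchcase from Python's standard library stands in for real URLPattern matching, and the CachedDictionary type and its field names are invented for the example.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass(frozen=True)
class CachedDictionary:
    match: str       # value of the match parameter from Use-As-Dictionary
    sha256_hex: str  # hash of the cached dictionary bytes


def pick_dictionary(path: str, candidates: List[CachedDictionary]) -> Optional[CachedDictionary]:
    """Return the matching dictionary with the longest match string, if any."""
    matching = [d for d in candidates if fnmatchcase(path, d.match)]
    if not matching:
        return None
    return max(matching, key=lambda d: len(d.match))


cache = [
    CachedDictionary(match="/*", sha256_hex="aaa"),
    CachedDictionary(match="/bootstrap@5.*/dist/css/bootstrap.min.css", sha256_hex="bbb"),
]

chosen = pick_dictionary("/bootstrap@5.3.3/dist/css/bootstrap.min.css", cache)
print(chosen)
```

Both patterns match the request path here, but the Bootstrap-specific one wins because its match string is longer.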

info

You can have as many shared compression dictionaries as you want.

You may have noticed that the Accept-Encoding header includes the dcb value, which stands for Dictionary-Compressed Brotli and informs your server that the browser supports Brotli dictionary compression. If the browser also supports Zstandard dictionary compression, it will also add dcz to Accept-Encoding, which stands for Dictionary-Compressed Zstandard.

3. The Server's Subsequent Response Header (with Vary)

If your server can identify and validate the proposed shared compression dictionary through its SHA-256 hash, it will compress the new version of Bootstrap's minified CSS file using the previous version as a dictionary.

It will send only the delta in the response, which it indicates by setting the Content-Encoding header to dcb:

HTTP/2 200
Date: Fri, 11 Oct 2024 09:41:34 GMT
Content-Type: text/css
Content-Length: 7049
Server: Apache/2.4.41 (Ubuntu)
Content-Encoding: dcb
Cache-Control: max-age=2592000
Vary: Accept-Encoding, Available-Dictionary

The above response also includes the Vary response header to ensure proper caching behavior.

Vary lists the request headers whose values the server used to select this response. A cache may reuse the stored response only for requests that carry the same values for those headers. Or, to put it differently, the contents of the response vary based on the listed request headers.

By listing Accept-Encoding and Available-Dictionary in Vary, you prevent caches (e.g. intermediary caches, CDN caches, browser caches, etc.) from serving the dcb resource in response to requests whose Accept-Encoding or Available-Dictionary values differ, that is, to clients that either don't support dictionary compression with Brotli or don't possess the relevant dictionary.

As the IETF draft states:

"If the response is cacheable, it MUST include a "Vary" header to prevent caches serving dictionary-compressed resources to clients that don't support them or serving the response compressed with the wrong dictionary: Vary: accept-encoding, available-dictionary."
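
Put together, the server-side decision could look roughly like the sketch below. It is a simplified illustration, not a reference implementation: it ignores q-values in Accept-Encoding, treats the Available-Dictionary value as an opaque lookup key, and all function and parameter names are hypothetical.

```python
from typing import Dict, Optional, Set


def parse_accept_encoding(value: str) -> Set[str]:
    """Return the offered content codings, ignoring q-values for simplicity."""
    return {part.split(";")[0].strip().lower() for part in value.split(",") if part.strip()}


def choose_response_headers(
    accept_encoding: str,
    available_dictionary: Optional[str],
    known_dictionaries: Set[str],
) -> Dict[str, str]:
    """Pick a Content-Encoding and the matching Vary header for one response."""
    encodings = parse_accept_encoding(accept_encoding)
    # Dictionary compression is only possible if the server holds the same file.
    has_dictionary = available_dictionary is not None and available_dictionary in known_dictionaries

    if has_dictionary and "dcb" in encodings:
        content_encoding = "dcb"
    elif has_dictionary and "dcz" in encodings:
        content_encoding = "dcz"
    elif "br" in encodings:
        content_encoding = "br"
    elif "zstd" in encodings:
        content_encoding = "zstd"
    elif "gzip" in encodings:
        content_encoding = "gzip"
    else:
        content_encoding = "identity"

    # The choice depends on both request headers, so both are listed in Vary.
    headers = {"Vary": "Accept-Encoding, Available-Dictionary"}
    if content_encoding != "identity":
        headers["Content-Encoding"] = content_encoding
    return headers


known = {"sha256-c916a28ff1803a3b4e5b19dfa65b3c39afefe27451c49563ce7babe516a47adf"}
print(choose_response_headers("gzip, br, zstd, dcb", next(iter(known)), known))
print(choose_response_headers("gzip, br, zstd, dcb", None, known))
```

The fallback order means that a client without a usable dictionary still receives an ordinary Brotli, Zstandard, or GZIP response.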

Dynamic Dictionary Compression

Dynamic dictionary compression can be used for dynamically generated resources that share a similar structure but differ in content, such as HTML pages using the same template, API calls, or JSON responses. These files typically contain identical elements (e.g. headers and footers), partially overlapping content (e.g. common key names in JSON), and unique elements.

For dynamic resources, you need to create a dedicated dictionary file that the browser can fetch from the server when the page is idle. Then, the browser can use this dictionary to decompress files compressed with the same dictionary on the server.

The following diagram is from Pat Meenan's aforementioned "Compression Dictionaries" talk and shows how shared compression dictionaries work with dynamic resources:

Dynamic dictionary compression diagram

In the diagram above, green indicates the identical parts, blue indicates the similar parts, and red indicates the unique parts of each resource. The green and blue parts can be extracted into the shared compression dictionary, while the red parts will be included in the delta.

Compressing dynamic resources with a dedicated shared compression dictionary can significantly reduce page weight on websites that include many instances of the same content type, such as blogs with several posts or eCommerce stores with multiple product pages.

As communication between the browser and server works similarly to delta compression, below, we'll just look into the additional details you need to know.

1. Define the Location of the Dedicated Dictionary

The Compression Dictionary Transport IETF draft defines a new link type for shared compression dictionaries: rel="compression-dictionary". It lets you inform the browser of the existence and location of the dedicated compression dictionary within either the HTML file or the HTTP response.

HTML Implementation

The HTML version works similarly to resource hints. You need to add the rel="compression-dictionary" attribute to the <link> tag and set the href attribute to the absolute or relative path of the dictionary file:

<link rel="compression-dictionary" href="/dict.dat" />

The compression-dictionary resource will be fetched at low priority during idle time. Since it's just a hint and not an instruction, browsers may or may not request the dedicated dictionary file.

warning

While compression-dictionary is similar to other resource hint link types (e.g. they download at low priority), avoid prefetching or preloading dedicated dictionaries, as browsers that don't support shared compression dictionaries may download them unnecessarily.

HTTP Implementation

As an alternative to the rel="compression-dictionary" HTML attribute, you can also use the Link response header to inform the browser that it may download a shared compression dictionary that applies to the requested page.

You can do so by assigning the relative or absolute path of the dictionary file, enclosed in angle brackets, along with the rel="compression-dictionary" parameter to the Link header:

HTTP/2 200
Date: Thu, 03 Oct 2024 14:18:44 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 51411
Server: Apache/2.4.41 (Ubuntu)
Content-Encoding: br
Cache-Control: max-age=2592000
Vary: Accept-Encoding
Link: </dict.dat>; rel="compression-dictionary"

tip

While any file type that can be served over HTTP can be used as a shared compression dictionary, it's good practice to use a neutral file extension, such as .dat, for a dedicated dictionary to prevent the browser from misinterpreting it as a standard HTML file.

2. Send the Dedicated Dictionary (with match and match-dest)

If the browser decides to request the dedicated dictionary during idle time, you can send the file in the following way:

HTTP/2 200
Date: Thu, 03 Oct 2024 14:21:23 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 7654
Server: Apache/2.4.41 (Ubuntu)
Use-As-Dictionary: match="/blog/*", match-dest=("document")
Content-Encoding: br
Cache-Control: max-age=2592000
Vary: Accept-Encoding

The above match pattern applies to every resource in your /blog subdirectory.

The Use-As-Dictionary header also includes the optional match-dest parameter, which can further improve resource loading in browsers that support the destination property for HTTP requests (you can use the values of destination for match-dest too).

The match-dest value of "document" means this dedicated dictionary is only used for document requests (i.e. HTML pages), so the browser won't offer it for requests with other destinations (e.g. images) within the same match path.

How to Create a Shared Compression Dictionary File?

As delta compression uses an earlier version of a resource as a shared compression dictionary, you only need to create a separate dictionary file if you want to use the technology to compress dynamic resources.

Use Brotli or Zstandard's Command Line Interface

You can generate dedicated dictionaries for both Brotli and Zstandard using their command line interface (CLI).

In both cases, you'll need to pass a list of files from which the Brotli or Zstandard CLI can extract the common strings and generate a dictionary file.

You can run Brotli's dictionary generator from Brotli's /research directory, while for Zstandard you need to run its CLI in training mode to create a dictionary.

Generate a Raw Dictionary File with Use-As-Dictionary's Web App

To generate a dedicated shared compression dictionary, you can also use the Dictionary Generator app on the aforementioned Use-As-Dictionary website.

This tool is remarkably easy to use. It allows you to generate a raw dictionary file from 2-100 URLs on the same domain. As it generates a non-encoded dictionary file in .dat format, you can use it for both Brotli and Zstandard compression.

tip

Use-As-Dictionary's Dictionary Generator sets the default dictionary size to 1024 KB. This might be way too much for your needs, resulting in a shared compression dictionary with many redundancies.

(For example, when I generated a dictionary for DebugBear's blog from eight single post URLs, the .dat file included the <head> section, header, and footer eight times.)

If you encounter a similar issue, reduce the dictionary size to a much lower value (e.g. I changed it to 20 KB) to avoid redundancies.

Create a Dedicated Dictionary Manually

If the files you want to create the dedicated dictionary for don't have a complex structure, you can also create a raw dictionary file manually.

In the screencast video below, you can see how I'm trying to create a dedicated dictionary file for DebugBear's blog:

Essentially, I'm replicating manually what Use-As-Dictionary's Dictionary Generator does programmatically:

  • open the rendered page of a sample URL (i.e. one of our blog posts)
  • find common strings
  • add them to the raw dictionary file

In the video above, I copy-pasted the <head> section and footer of a single post, which are identical or almost identical for each post. For a real-world shared compression dictionary for the blog, I would also look for other common strings (e.g. the top menu or logo) and remove post-specific data, such as the title, URL, and date, to make it more effective.

You might find a more professional way to create a dictionary manually (e.g. by copying your HTML templates). For the above video, my aim was just to show that creating a shared compression dictionary is really not a complicated process.
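
If you'd rather script the manual steps, a deliberately naive starting point is to keep only the lines that occur in every sample page. The sketch below is my own illustration and is not what Use-As-Dictionary's Dictionary Generator does internally; the function name, the sample markup, and the 20 KB default cap (borrowed from the tip above) are assumptions.

```python
from typing import List


def build_raw_dictionary(pages: List[str], max_bytes: int = 20 * 1024) -> bytes:
    """Collect the non-empty lines shared by every page, in first-page order."""
    if not pages:
        return b""

    # Lines that appear in all sample pages.
    common = set(pages[0].splitlines())
    for page in pages[1:]:
        common &= set(page.splitlines())

    seen = set()
    kept = []
    for line in pages[0].splitlines():
        if line in common and line.strip() and line not in seen:
            seen.add(line)
            kept.append(line)

    # A raw dictionary is just bytes, so a hard size cap is acceptable here.
    return "\n".join(kept).encode("utf-8")[:max_bytes]


# Hypothetical sample posts that share a template but differ in their title.
post_a = '<head>\n<title>Post A</title>\n<link rel="stylesheet" href="/main.css">\n</head>\n<footer>Example footer</footer>'
post_b = '<head>\n<title>Post B</title>\n<link rel="stylesheet" href="/main.css">\n</head>\n<footer>Example footer</footer>'

raw_dictionary = build_raw_dictionary([post_a, post_b])
print(raw_dictionary.decode("utf-8"))
```

Post-specific lines such as the title drop out automatically because they differ between samples. A line-based comparison will miss strings that are shared only within a line, so treat the output as a first draft to review by hand.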

tip

Don't worry if a few post-specific strings remain in your dictionary (unless they include sensitive user data — see the next section), as the Brotli or Zstandard algorithm will simply ignore those strings.

Also don't worry if you miss a common string, as the algorithms still perform the same checks for common strings that they run when not using a shared compression dictionary.

Is It Safe to Use Shared Compression Dictionaries?

According to the currently available information, shared compression dictionaries specified by the Compression Dictionary Transport protocol (still in the drafting stage) are safe to use.

However, compression dictionaries had an earlier implementation called SDCH (Shared Dictionary Compression over HTTP), which was found to be a security vulnerability, so browser vendors stopped supporting it. For reference, see the following resources from the early 2010s:

warning

SDCH and shared compression dictionaries defined by IETF's Compression Dictionary Transport specification are two different things and shouldn't be confused.

The SDCH specification was deprecated in 2017 and should no longer be referenced.

The new Compression Dictionary Transport protocol is a more advanced technology and includes additional security measures that SDCH lacked. Plus, browsers' privacy features have significantly improved since the early 2010s.

Privacy and Security Improvements

The new protocol mandates that shared compression dictionaries must be transmitted over the network using the secure HTTPS protocol.

There's also a strict same-origin policy (see above) that requires you to either serve the dictionary from the same domain as the compressed resources or add an Access-Control-Allow-Origin header that points to your own domain for cross-origin (CORS) requests.

The new protocol also makes use of modern browsers' cache partitioning feature (a.k.a. storage partitioning), ensuring that your domain has its own partition in the user's browser cache that other websites can't access (e.g. you can't load the Bootstrap library from cache if it was cached by another domain, even though it consists of the same files). Cache partitioning prevents certain kinds of side-channel attacks and data leakage, including timing attacks, XS-Leaks (i.e. cross-site leaks), and COSI (i.e. Cross-Origin State Inference).

Privacy and Security Risks

Despite the improvements, you still need to pay attention when using a shared compression dictionary to mitigate security risks.

The main vulnerability is that shared compression dictionaries sent over the network might still be exploited, so ensure that your dictionaries don't include any sensitive data that might compromise users' privacy. This can also happen by accident, especially if you have generated the dictionary using an automated tool.

tip

Either revise your shared compression dictionaries manually before compressing and sending them over the network or generate them only from resources that don't include any private user data (e.g. blog posts on a typical blog).

Wrapping Up and Further Resources

Shared compression dictionaries can reduce page weight, speed up page load times, and improve web performance metrics in a significant way.

As the feature has already been implemented by Chrome and will also be available in other Chromium-based browsers soon, you can start experimenting with it on your website today. Since browsers that don't support shared compression dictionaries simply ignore the Use-As-Dictionary header and compression-dictionary link type, you don't need to worry about providing a fallback method.

You can check out some examples with performance data from real-world implementations on major websites, including YouTube, CNN, Yahoo, Etsy, Amazon, and more.

Here's a list of the current status of CDN support for the Compression Dictionary Transport protocol, too, including Amazon CloudFront, Cloudflare, and Fastly.

At DebugBear, we're really excited about the new feature. You can test how much shared compression dictionaries reduce page weight and improve web performance metrics on your own website using our synthetic lab tests and compare tool — you can get started for free. I recommend starting with a simpler delta compression and then gradually progressing to more complicated dynamic dictionaries.
