Merging HTTP Requests vs Parallel HTTP Requests, Which is Faster?

When interviewing, I often ask candidates: how would you improve the performance of a web page?

Some will mention the point: reduce/merge HTTP requests.

Then I'll follow up: can't the browser download resources in parallel? Why not download them in parallel? Is combining multiple resources into one and downloading it with a single HTTP request really faster than downloading the unmerged resources in parallel with multiple HTTP requests?

Candidate: ... (I haven't heard a satisfying answer yet.)

Minimizing HTTP requests is the first of Yahoo's 35 golden rules for front-end performance optimization. Yahoo proposed these rules in 2006, and since then they have deeply influenced hundreds of thousands of front-end developers. Even today, 12 years later, their influence has not faded.

However, the 35 rules also include another one: split resources to maximize the browser's ability to download in parallel. Now we have a problem. We need to minimize HTTP requests, but the resources a page needs can't simply be removed (otherwise it would no longer be the same page), so minimizing HTTP requests mainly means merging resources. One rule recommends merging resources while another recommends splitting them, which is an obvious conflict. So what should we do?

I've found some articles discussing this problem, but most authors stop at theoretical analysis built on assumptions and ignore the impact of the TCP transmission mechanism. So today I'll explore the question both experimentally and theoretically.

HTTP Request Process

The main process of an HTTP request is:

DNS resolution (T1) -> Establish TCP connection (T2) -> Send request (T3) -> Wait for the server to return the first byte (TTFB) (T4) -> Receive data (T5).

As shown in the figure below, Chrome DevTools displays the main phases of an HTTP request. Note that the Queueing phase is the time the request spends waiting in the browser's queue and isn't counted as part of the HTTP request time.

From this process, it can be seen that merging N HTTP requests into one saves (N-1) * (T1 + T2 + T3 + T4).
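These phases can be observed directly in the browser. Below is a minimal sketch (not part of the original experiment) that uses the standard Resource Timing API to break each stylesheet request into roughly the T1-T5 phases; detailed timings for cross-origin resources are only exposed when the server sends a Timing-Allow-Origin header.

// A minimal sketch using the Resource Timing API to inspect the phases of
// each stylesheet request; run it in the browser console after the page loads.
const entries = performance.getEntriesByType("resource") as PerformanceResourceTiming[];

for (const e of entries.filter((r) => r.initiatorType === "link")) {
  console.log(e.name, {
    dns: e.domainLookupEnd - e.domainLookupStart, // T1
    tcp: e.connectEnd - e.connectStart,           // T2 (0 when the connection is reused)
    ttfb: e.responseStart - e.requestStart,       // T3 + T4
    download: e.responseEnd - e.responseStart,    // T5
    total: e.responseEnd - e.startTime,
  });
}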

However, the real world isn't so ideal, and the above analysis has several flaws:

  1. The browser caches DNS information, so not every request needs DNS resolution.
  2. The HTTP 1.1 keep-alive feature allows HTTP requests to reuse existing TCP connections, so not every HTTP request needs to establish a new TCP connection.
  3. The browser can send multiple HTTP requests in parallel, which also affects resource download time. The above analysis obviously assumes there is only one HTTP request at a time.

Experimental Verification

Let's run four sets of experiments comparing the time it takes one HTTP request to load a merged resource against the time it takes multiple HTTP requests to load the split resources in parallel. The resource sizes differ significantly across the four sets.

Experiment environment

Server: Cloud ECS, 1 core, 2GB memory, Bandwidth 1M

Web server: Nginx (Gzip not enabled)

Browser: Chrome v66, incognito mode, caching disabled

Client network: Wi-Fi, bandwidth 20M

Experiment code URL: https://github.com/xuchaobei/...

Experiment 1

Test files: large1.css, large2.css ... large6.css (141K each); large-6in1.css (merged from the previous six CSS files, 846K). large1.css ... large6.css are referenced in parallel-large.html, and large-6in1.css is referenced in combined-large.html. The code is as follows:

parallel-large.html

<!DOCTYPE html>
<html>

  <head>
    <meta charset="utf-8" />
    <title>Parallel Large</title>
    <link rel="stylesheet" type="text/css" media="screen" href="large1.css" />
    <link rel="stylesheet" type="text/css" media="screen" href="large2.css" />
    <link rel="stylesheet" type="text/css" media="screen" href="large3.css" />
    <link rel="stylesheet" type="text/css" media="screen" href="large4.css" />
    <link rel="stylesheet" type="text/css" media="screen" href="large5.css" />
    <link rel="stylesheet" type="text/css" media="screen" href="large6.css" />
  </head>

  <body>
    Hello, world!
  </body>

</html>

combined-large.html

<!DOCTYPE html>
<html>

  <head>
    <meta charset="utf-8" />
    <title>Combined Large</title>
    <link rel="stylesheet" type="text/css" media="screen" href="large-6in1.css" />
  </head>

  <body>
    Hello, world!
  </body>

</html>

Refresh each page 10 times and use the Network panel in DevTools to calculate the average load time of the CSS resources.

Note:

  1. The load time of large1.css, large2.css ... large6.css is measured from the moment the HTTP request for the first resource is sent to the moment the download of all 6 files completes, as shown in the red box in Figure 2 (a scripted version of this measurement is sketched after these notes).
  2. The two HTML pages must not be loaded at the same time; otherwise the bandwidth would be shared between them and skew the results. Wait until one page finishes loading, then manually refresh and load the other.
  3. Keep the interval between page refreshes over 1 minute so that HTTP 1.1 connection reuse doesn't affect the experiment.
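For note 1, instead of reading the numbers off the Network panel by hand, the same span can be computed from the Resource Timing API. A rough sketch, assuming the CSS files are same-origin so the detailed timings are exposed:

// Measure from the moment the first CSS request is sent to the moment the
// last CSS response finishes, matching the red box in Figure 2.
const css = (performance.getEntriesByType("resource") as PerformanceResourceTiming[])
  .filter((e) => e.name.endsWith(".css"));

const firstRequestSent = Math.min(...css.map((e) => e.requestStart));
const lastByteReceived = Math.max(...css.map((e) => e.responseEnd));
console.log(`${css.length} CSS files, span: ${(lastByteReceived - firstRequestSent).toFixed(1)} ms`);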

Figure 2

The experiment results are as follows:

                     large-6in1.css    large1.css ... large6.css
  Average time (s)   5.52              5.30

Next, let's merge large1.css, large2.css ... large6.css into 3 resources, large-2in1a.css, large-2in1b.css, and large-2in1c.css (282K each), referenced in combined-large-1.html:

combined-large-1.html

<!DOCTYPE html>
<html>

  <head>
    <meta charset="utf-8" />
    <title>Parallel Large 1</title>
    <link rel="stylesheet" type="text/css" media="screen" href="large-2in1a.css" />
    <link rel="stylesheet" type="text/css" media="screen" href="large-2in1b.css" />
    <link rel="stylesheet" type="text/css" media="screen" href="large-2in1c.css" />
  </head>

  <body>
    Hello, world!
  </body>

</html>

Test 10 times; the average load time is 5.20 s.

The total experiment results are as follows:

                     large-6in1.css    large1.css ... large6.css    large-2in1a.css ... large-2in1c.css
  Average time (s)   5.52              5.30                         5.20

From the results of Experiment 1, merging or splitting the resources has no significant effect on their total load time. The fastest case was splitting into three resources (5.20 s) and the slowest was merging into one (5.52 s), but the gap is only 6%. Given the randomness of the environment and that each case was repeated only 10 times, this gap doesn't indicate a significant difference among the three scenarios.

Experiment 2

Increase the size of the css files.

Test files: xlarge1.css, xlarge2.css, xlarge3.css (1.7M each); xlarge-3in1.css (merged from the previous three CSS files, 5.1M). xlarge1.css, xlarge2.css, and xlarge3.css are referenced in parallel-xlarge.html, and xlarge-3in1.css is referenced in combined-xlarge.html.

The test process of Experiment 2 is the same as Experiment 1, and the experiment results are as follows:

                     xlarge-3in1.css   xlarge1.css, xlarge2.css, xlarge3.css
  Average time (s)   37.72             36.88

The time difference here is only 2%, even smaller than before, so it is even harder to claim a significant difference between the total load times of the merged and split resources.

In fact, as resources get larger, their load times should ideally converge.

Theoretically, HTTP transfers run over TCP connections, and TCP has a slow-start phase: bandwidth that isn't fully utilized at the beginning is gradually occupied as slow start proceeds. With large resources the bandwidth ends up fully occupied and becomes the bottleneck, so adding more TCP connections can't speed things up. The larger the resource, the smaller the share of the total download time spent in slow start, so the bandwidth is fully utilized most of the time. Since the total amount of data is the same (the extra headers caused by splitting are negligible here) and the bandwidth is the same, the transfer time is of course the same too.
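As a rough sanity check of this reasoning against Experiment 2, assuming the server's "1M" bandwidth means roughly 1 Mbit/s (my assumption, not stated in the original setup):

// Back-of-the-envelope: time to push 5.1 MB through a ~1 Mbit/s link,
// ignoring connection setup and slow start entirely.
const sizeMB = 5.1;      // xlarge-3in1.css
const bandwidthMbps = 1; // assumed meaning of the server's "1M" bandwidth
const seconds = (sizeMB * 8) / bandwidthMbps;
console.log(`${seconds.toFixed(1)} s`); // ~40.8 s, the same order as the ~37 s measured

The transfer time is dominated by raw bandwidth, so it is unsurprising that one connection and three connections measure almost the same.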

Experiment 3

Reduce the css file size.

Test files: medium1.css, medium2.css ... medium6.css (9.4K each); medium-6in1.css (merged from the previous six CSS files, 56.4K). medium1.css ... medium6.css are referenced in parallel-medium.html, and medium-6in1.css is referenced in combined-medium.html.

The experiment results are as follows:

                      medium-6in1.css   medium1.css ... medium6.css
  Average time (ms)   34.87             46.24

Note that the unit changes to ms.

The time difference in Experiment 3 is 33%, but in absolute terms that is only about 11 ms. Let's continue with Experiment 4.

Experiment 4

Continue reducing the CSS file size, down to tens of bytes.

Test files: small1.css, small2.css ... small6.css (28B each); small-6in1.css (merged from the previous six CSS files, 173B). small1.css ... small6.css are referenced in parallel-small.html, and small-6in1.css is referenced in combined-small.html.

The experiment results are as follows:

                      small-6in1.css    small1.css ... small6.css
  Average time (ms)   20.33             35

The time difference in Experiment 4 is 72%.

From Experiments 3 and 4, we can see that when resources are small, there is an obvious difference between the load time of the merged resource and that of the split resources. Figures 3 and 4 are screenshots of the test results of Experiment 4. When a resource is very small, downloading the data (the blue part of the horizontal bars in the figures) takes only a small share of the total time; the load time is dominated by DNS resolution (T1), establishing the TCP connection (T2), sending the request (T3), and waiting for the server to return the first byte (TTFB) (T4).

However, establishing multiple HTTP connections at the same time carries extra cost. For each request, the DNS lookup time and the TCP connection setup time are somewhat random, so issuing requests concurrently increases the chance that some individual request becomes significantly slower. As shown in Figure 3, small1.css has the shortest load time (16 ms) while small5.css has the longest (32 ms). Since the total time is measured up to the completion of all resources, making multiple HTTP requests at once leads to greater variance and uncertainty, which often makes it slower than loading the merged resource with a single HTTP request.

Figure 3

Figure 4

More Complicated Scenarios

Is it always better to merge small files, then?

Not always. In fact, in some cases, merging small files may significantly increase resource load time.

Theoretically, to improve transmission efficiency, the sender on a TCP connection doesn't wait for an ACK from the receiver after every packet; it keeps sending. TCP introduces the concept of a "window": the maximum amount of data that can be sent without waiting for an ACK. For example, a window size of 4 MSS (Maximum Segment Size, the largest data segment a TCP packet can carry at a time) means four segments can be sent back to back without waiting for an acknowledgement from the receiver, i.e. four segments are transmitted per round trip. As shown in the figure below (MSS is 1, window size is 4), bytes 1-4000 are sent continuously without waiting for an ACK, and likewise bytes 4001-8000. Note that this is only a schematic of the ideal case; real scenarios are more complicated.

TCP maintains a congestion window variable, and during the slow-start phase the window size equals the congestion window. In slow start, the congestion window doubles with every network round trip. For example, assuming an initial congestion window of 1, its size grows as 1, 2, 4, 8 ..., as shown below.

In real networks the initial congestion window is generally 10, so it grows as 10, 20, 40 ... . The MSS depends on the network topology and hardware; on Ethernet it is generally 1460 bytes. Assuming every segment carries a full MSS (in practice it can carry less), at most 10 * 1.46 = 14.6K of data is transmitted after the first network round trip, (10+20) * 1.46 = 43.8K after the second, and (10+20+40) * 1.46 = 102.2K after the third.
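The same arithmetic can be written as a tiny helper. This is only a sketch of the idealized slow-start model described above (initial congestion window of 10 segments, MSS of 1460 bytes, window doubling every round trip, no loss or congestion avoidance):

// Estimate how many network round trips an idealized slow-start transfer
// needs for a resource of the given size.
function roundTripsNeeded(sizeBytes: number, initialCwnd = 10, mssBytes = 1460): number {
  let delivered = 0;
  let cwnd = initialCwnd;
  let rtts = 0;
  while (delivered < sizeBytes) {
    delivered += cwnd * mssBytes; // data sent during this round trip
    cwnd *= 2;                    // slow start: the window doubles each round trip
    rtts += 1;
  }
  return rtts;
}

console.log(roundTripsNeeded(9.4 * 1024));  // 1  (a split file from Experiment 3)
console.log(roundTripsNeeded(56.4 * 1024)); // 3  (medium-6in1.css)
console.log(roundTripsNeeded(173));         // 1  (small-6in1.css from Experiment 4)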

According to the above theory, in Experiment 4 both the merged resource and the split resources finish within one network round trip. In Experiment 3, however, each split resource is 9.4K and completes in one round trip, while the merged resource is 56.4K and needs 3 round trips. If the network latency is high (say a 1 s round trip) and bandwidth isn't the bottleneck, those two extra round trips alone add about 2 s, so merging the resources may not be worth it.

Summary

For large resources, whether they are merged has no significant impact on load time. Splitting resources lets us make better use of the browser cache: updating one resource won't invalidate the caches of all the others, whereas with a merged resource any update expires the cache of the whole thing. In addition, domain sharding can be used to serve the split resources from different domains, which spreads server load and reduces the impact of network jitter.

For small resources, merging tends to load faster, but when network conditions are good the gain is measured in milliseconds and can be ignored. When network latency is high and the server responds slowly, merging can bring real benefits. However, in high-latency networks you also need to watch whether merging increases the number of network round trips, because that will hurt load time.

In the end, the key isn't choosing between merging HTTP requests and issuing them in parallel; it's understanding how things work under the hood and which business scenarios each approach suits.
