Analyzing Factors That Influence End-to-End Web Performance

Balachander Krishnamurthy
AT&T Labs--Research
180 Park Ave
Florham Park, NJ 07932 USA
bala@research.att.com

Craig E. Wills
Computer Science Department
Worcester Polytechnic Institute
Worcester, MA 01609 USA
cew@cs.wpi.edu

Abstract:

Web performance affects the popularity of a particular Web site or service as well as the load on the network, yet there have been no publicly available end-to-end measurements focused on a large number of popular Web servers that examine the components of delay or the effectiveness of the recent changes to the HTTP protocol. In this paper we report on an extensive study carried out from many client sites, geographically distributed around the world, to a collection of over 700 servers to which a majority of Web traffic is directed. Our results show that the HTTP/1.1 protocol, particularly with pipelining, is indeed an improvement over existing practice, but that servers serving a small number of objects or closing a persistent connection without explicit notification can reduce or eliminate any performance improvement. Similarly, use of caching and multi-server content distribution can also improve performance if done effectively.

Keywords: Web Performance, Web Protocols, End-to-End Performance, Active Measurement

1 Introduction

Several projects have studied factors that influence the performance of the Web using Web server logs, proxy logs and client traces. Significant effort has gone into improving the protocol on which the Web is based, and the new version of HTTP, namely HTTP/1.1, has recently been upgraded to a Draft Standard by the IETF [4]. However, to the best of our knowledge, no one has performed an end-to-end test of the Web in terms of the factors that influence performance as perceived by users when they visit various Web sites.

This work grows out of independent work by the authors for testing different aspects of Web performance. It is a natural follow-up to the PROCOW study [9], an earlier large-scale study examining whether the Web servers running at popular Web sites around the world that claimed to run HTTP/1.1 were indeed compliant with the HTTP/1.1 protocol specification. The PROCOW study examined the compliance of the servers by sending valid HTTP/1.1 client requests from several places in the world. Its major conclusion was that many sites were not fully compliant with the HTTP/1.1 protocol and some sites were turning off the new features in HTTP/1.1. The work described in this paper also follows up on the methodology of work examining content reuse and server responses relevant for Web caching [19].

In this work we build on the PROCOW infrastructure and use nine client sites around the world to test how various factors affect the performance of the Web. Our study is based on the key set of changes between HTTP/1.0 and HTTP/1.1 [10] to test their impact on performance. We also examine their impact on performance in conjunction with other factors such as caching and distributed Web content.

Our study is significant in that it examines the performance of various protocol options by measuring end-to-end response for actual Web servers from a variety of client sites. This is a significantly broader study than that of Nielsen et al. [17], which measured the performance of HTTP/1.0 and HTTP/1.1 in a controlled setting for a single synthesized Web page. We gathered a set of active measurements by sending requests from various client sites around the world to over 700 popular sites to help quantify the benefits of HTTP/1.1 features in conjunction with other factors. These include persistent connections, persistent connections with pipelining, range requests, caching and responses from multiple servers for a single Web page.

The collection of Web servers tested is representative of the servers that matter in practice, since we attempted to include servers to which a significant portion of the request traffic is addressed. We do not claim our client sites are representative of all users, but they do include a variety of sites. We ensured that there is some degree of control in our experiment, but given that the Web experience changes fairly often, our study mimics real life. In analyzing the results, we look for trends rather than absolute numbers. The fact that the results are relatively consistent over multiple clients, and that we make some measurements from the same client/server pair over multiple days at the same time, gives us some confidence in the repeatability of our experiments.

In the remainder of this paper we describe the factors studied in our work, followed by a discussion of the methodology we used in performing the study. The middle portion of the paper presents the results from our study on the test sets we use followed by a discussion on possible implications of these results for Web servers, caches and the HTTP protocol. The paper concludes with a description of related work followed by a summary and our own directions for future work.

2 Study

There are numerous factors involved in the end-to-end performance of retrieving a Web page over the Internet. In the following we describe the specific factors we study in this work and our reasons for doing so, the factors we do not explicitly study but must account for in the testing and analysis, and the factors we do not examine.

2.1 Factors Studied

The factors we explicitly study are:

  1. The HTTP protocol options: serialized and parallel HTTP/1.0 requests, and HTTP/1.1 persistent connections with and without pipelining.

  2. Range requests.

  3. Caching.

  4. Responses from multiple servers for a single Web page.

2.2 Factors Considered

In studying these factors there are numerous other factors that must be accounted for in our testing and analysis. These factors are:

2.3 Factors Not Examined

There are many other factors influencing the end-to-end performance in retrieving a Web page. The following are some that we considered for our current study, but did not include.

3 Methodology

Our basic methodology is to make active measurements on the effect of different protocol options for a large number of client/server pairs at different times. The previous study on HTTP/1.1 performance improvements used an artificial environment for testing [17]. Other Web performance studies have used logs and packet traces to obtain timing information, but this approach does not allow a controlled set of requests to be issued. Our approach was to identify a set of client and server sites for our tests.

We came to an early conclusion that it would be difficult to have a set of representative client sites since information about the distribution of clients and their network connections is both hard to obtain and verify. Additionally, obtaining a fair sampling of clients around the world would involve significant effort. We chose to sample from a set of client sites where we had professional connections. The client sites used in our study, along with their location and network setup, are:

  1. AT&T Research Labs, NJ USA, with multiple T-1 connections to the Internet.

  2. A commercial site in Santiago, Chile (10 Mbps via fiber to Telefonica Net, which has a slower link to the Internet and links to Cable&Wireless and Alternet via two hops of 45 Mbps ATM links).

  3. Hewlett-Packard Labs, Palo Alto, CA, USA, which connects to the public Internet via one of four major ISPs depending upon the traffic's destination, each ISP being connected with one or more T-3 circuits.

  4. ACIRI: AT&T Center for Internet Research at ICSI, Berkeley, CA, USA (10 Mbps link from ACIRI to UCB, then UCB to Internet over Calren).

  5. University of Kentucky, Lexington, Kentucky, USA connected via UUNET over a DS3 (45 Mbps link).

  6. A site belonging to an academic network, UNINETT AS, in Trondheim, Norway (10 Mbps link to UNINETT-GW and increasing-speed links to the Internet via NORDUnet).

  7. University of Western Australia, Nedlands, Western Australia.

  8. A private site in Cape Town, South Africa with a 64 Kbps digital connection (similar to a US-standard 56K connection) to the Internet.

  9. Worcester Polytechnic Institute, Worcester, MA USA with multiple T-1 links to the Internet.

The choice of servers is a bit easier since there is broad agreement on which sites are ``popular'': advertisers depend on this information, and sites will attempt to show that studies significantly lowering their standing in any ranking are incorrect. Accordingly, we used a subset of the server collection assembled for the PROCOW study [9], which used a combination of collection techniques. Briefly, it merged recognized rating sites (MediaMetrix [13], Netcraft [16], Hot100 [18]) and a set of sites that are likely to be popular given their business prominence (Fortune 500 [5] and Global 500 [7]). The end result was a list of 711 popular server sites for which we retrieved the home page and all embedded objects. In creating the list, we did not try to distinguish whether a server site was supporting HTTP/1.0 or HTTP/1.1.

The basic engine for making all retrievals in our study is httperf [15]. We obtained a publicly available copy of the software and modified it slightly to print out additional information needed for our study. The httperf software is attractive because it allows a set of objects to be retrieved from a server using the variety of 1.0 and 1.1 protocol options of interest to our study. The native software collects and prints out a number of statistics about the status and performance of retrieving the set of objects. Of particular interest to our study is that it records the number of server connections made and for each retrieved object, the time the request was sent, the time receipt of the response began and the time the complete response was received. For small objects the last two times may be the same.

The algorithm in Figure 1 describes the method used for a single test between a client and a server. It exploits features of httperf and overcomes certain limitations of httperf for the purposes of our study. The two primary limitations we had to overcome were that httperf does not parse HTML to retrieve embedded objects and that a single run of httperf can communicate with only one server. However, we exploited httperf's ability to retrieve a fixed set of URLs from a server using serial requests over separate connections, parallel requests over separate connections, serial requests over a persistent connection, and pipelined requests over a persistent connection. The resulting algorithm, starting with a base URL for a given server, is shown in Figure 1.

1. Use httperf to retrieve the base URL from the server and store the results.
2. Parse the base URL code to determine all unique embedded objects.
3. Separate the base and embedded objects according to their server.
4. For each server containing needed objects {
    5. (serial-1.0) Use httperf to retrieve all objects using 
       serialized HTTP/1.0 requests.
    6. (burst-1.0) Use httperf to retrieve all objects using 
       up to four parallel HTTP/1.0 requests.
    7. (serial-1.1) Use httperf to retrieve all objects using 
       serialized requests over an HTTP/1.1 persistent connection.
    8. (burst-1.1) Use httperf to retrieve all objects using 
       pipelined requests over an HTTP/1.1 persistent connection.
}
Figure 1: Basic Algorithm to Test a Server from a Client

There are a number of points to note about the algorithm. The initial retrieval in Step 1 is used to determine the set of objects to fetch. If all embedded objects are from the same server as the base URL then there will be only one server list in Step 3. The server list for the base server includes the base object. The burstiness of parallel connections in Step 6 and pipelined connections in Step 8 does not begin until the first object in the list is retrieved. Step 1 retrieves and stores the object contents. Steps 5-8 retrieve, but do not store, object contents. Steps 5-8 are used for all performance measurements.
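To make the flow in Figure 1 concrete, the sketch below expresses the test driver in Python under some simplifying assumptions: run_httperf is a hypothetical wrapper around our modified copy of httperf (its interface is invented purely for illustration), and embedded objects are found with a simple pattern match rather than full HTML parsing.

from urllib.parse import urljoin, urlparse
import re

def run_httperf(server, uris, mode):
    """Hypothetical wrapper around a (modified) httperf run.

    Its interface -- returning per-object timings and the number of TCP
    connections used -- is an assumption for illustration, not the actual
    tool's command line.
    """
    raise NotImplementedError

def embedded_objects(base_url, html):
    # Step 2: find the unique embedded objects referenced by the base page.
    refs = re.findall(r'(?:src|background)\s*=\s*["\']?([^"\'\s>]+)', html, re.I)
    return sorted({urljoin(base_url, r) for r in refs})

def test_server(base_url, base_html):
    # Step 3: separate the base and embedded objects according to their server.
    objects = [base_url] + embedded_objects(base_url, base_html)
    by_server = {}
    for url in objects:
        by_server.setdefault(urlparse(url).netloc, []).append(url)
    # Steps 4-8: run the four protocol options against each server holding objects.
    results = {}
    for server, uris in by_server.items():
        results[server] = {
            mode: run_httperf(server, uris, mode)
            for mode in ("serial-1.0", "burst-1.0", "serial-1.1", "burst-1.1")
        }
    return results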

We used four parallel connections in the burst-1.0 method since that appears to be the default in the popular browsers (Netscape and Internet Explorer). Each additional parallel connection imposes additional load on the client and the server (more on the server, since it must have free TCP slots to handle several client connections in parallel).

This basic test is used to compare performance of each protocol option for a specific client and server. While the comparison of response times for an individual test may not be meaningful due to variation in network and server load, we use these individual tests as building blocks for measuring the relative performance of the protocol options over a large number of tests.

We used this basic test over two sets of test data. All tests were made in November 1999. The first set of test data consists of the 711 previously identified servers. Tests of each of these servers were run once from each of the client sites. The tests lasted several hours, so each client/server test may be run under different network conditions, but the four protocol options of each test are run under approximately the same conditions.

In addition, we ran the test in a more controlled setting from the AT&T, Chile and WPI client sites. For these tests we selected 72 server sites from a list of 200 sites identified on the current MediaMetrix, Netcraft and Hot100 lists. The selected sites were chosen because they supported pipelining and persistent connections in a preliminary test run on these sites. The controlled tests were run at the same fixed six-hour intervals from each site for a week. This controlled test was designed to study caching and time-of-day factors.

4 Results

This section provides results to address the factors for study that were identified in Section 2.

4.1 Test Sets

The initial part of our results analyzes the base set of statistics for our client/server test sets, shown in Table 1. The clients are our test sites; the server set is either ``procow'', indicating we tested the 711 servers from the earlier PROCOW study, or ``select'', indicating we tested the 72 selected servers repeatedly every six hours.

Client/Server Set | Successful Retrieval of Base URL | Servers with Successful Object Retrieval | Multiple Object Servers | Perfect Conn. Persistence Servers (%) | Imperfect Conn. Persistence Servers (%)
att/procow 670 855 674 167 (25%) 121 (18%)
aciri/procow 673 858 667 223 (33%) 73 (11%)
aust/procow 667 854 664 201 (30%) 56 (8%)
chile/procow 674 862 671 200 (30%) 74 (11%)
hp/procow 665 854 662 201 (30%) 73 (11%)
uky/procow 645 824 635 196 (31%) 84 (13%)
norway/procow 668 856 662 128 (19%) 38 (6%)
safrica/procow 663 848 654 194 (30%) 92 (14%)
wpi/procow 657 834 662 192 (29%) 66 (10%)
att/select 1515 2588 1975 858 (43%) 206 (10%)
chile/select 1873 3161 2423 910 (38%) 288 (12%)
wpi/select 1897 3223 2456 1049 (43%) 274 (11%)
Table 1: Test Sets

Focusing on the PROCOW test set, the second column indicates the number of servers out of the 711 that returned an HTTP 200 response code (success) when the base URL was retrieved. For those not returning this value, the server either returned 302 (redirection) or 404 (not found), or the client timed out. Once the base URL was retrieved, all objects contained on this page were retrieved. As shown in Figure 1, multiple servers may be accessed to retrieve all objects. The third column of Table 1 shows a count of these servers that successfully returned all objects. The fourth column shows the number of these servers that return more than one object. We focus on these servers because persistent connections will not have an effect on client access time if only one object is retrieved. The last two columns in Table 1 show the number and percentage of multiple object servers exhibiting persistent connections.

The last two columns need further explanation: one focus of our study is to compare the performance of the four protocol options. Thus for a client/server test, we only want to consider cases where all objects needed from that server are successfully retrieved for all protocol options. If one or more of the four protocol option tests retrieves fewer than all objects then we discount that test (all four protocol options) for further study. In addition to the number of objects retrieved we also examine the number of TCP connections that are used. If an HTTP/1.1 test used as many TCP connections as objects then all objects have been retrieved, but not with any persistent connections. These test cases are also eliminated from further study. The remaining tests are classified as showing some connection persistence. These tests retrieve all objects for all protocol options and use fewer TCP connections than objects needed for both 1.1 options. Of this category, we classify tests using only one TCP connection for both HTTP/1.1 tests as ``perfect'' indicating all objects are retrieved in a single persistent connection. We characterize tests exhibiting some connection persistence, but not perfect connection persistence, as ``imperfect'' meaning that one or both of the 1.1 options used more than one TCP connection.
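The classification just described can be summarized in a short sketch; the per-option record layout (objects retrieved, TCP connections used) is an illustrative assumption, not the study's actual data format.

def classify_persistence(results, num_objects):
    """Classify one client/server test given per-option results.

    results maps option name -> (objects_retrieved, tcp_connections_used).
    """
    # Discard the whole test if any of the four options missed an object.
    if any(objs < num_objects for objs, _ in results.values()):
        return "discard"
    conns_11 = (results["serial-1.1"][1], results["burst-1.1"][1])
    # No persistence: a 1.1 option used as many TCP connections as objects.
    if any(c >= num_objects for c in conns_11):
        return "no persistence"
    # Perfect: both 1.1 options fetched every object over a single connection.
    if all(c == 1 for c in conns_11):
        return "perfect"
    # Imperfect: some persistence, but more than one connection for a 1.1 option.
    return "imperfect"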

The last three rows of Table 1 show the same statistics for the selected set of servers tested periodically from three client sites. The last two columns indicate that these tests are a bit better in exhibiting a higher percentage of persistence, but the percentages are not as high as we would expect from a ``select'' group. These reduced numbers come from the inclusion of servers not exhibiting persistence in our initial tests, due to an early error (subsequently fixed) in one of our analysis scripts. In addition, even if the base server supports persistence, other servers providing its embedded objects may not. Again, only tests exhibiting persistence are considered for further analysis.

As a final point in summarizing the test sets, we note that the Nielsen et al. paper on HTTP/1.1 also described two other performance improvements--cascading style sheets (CSS) and portable network graphics (PNG). As an interesting sidelight to our study we examined the penetration of these improvements among the set of PROCOW servers. Examination of the AT&T results (typical of the other results) showed 82 (12%) of the 670 base URLs using style sheets. A similar examination for PNG images found zero usage.

4.2 Protocol Options

In examining the various issues influencing end-to-end Web performance we first examined the impact of the four protocol options described in Figure 1. For this analysis we only consider the client/server tests exhibiting persistence. Results for the four protocol options from three of the client sites using the PROCOW test set are shown in Table 2 for servers that exhibit perfect connection persistence in the retrieval of objects. Results from the remaining client sites are shown in Table 8 at the end of the paper.

Client Site | Object Count Range | Range Pct. | Pct. Persistent for Range | Ave. Retrieval Time (Sec.) (Ratio with Burst-1.0): Serial-1.0, Burst-1.0, Serial-1.1, Burst-1.1
att 2-5 26% 49% 1.83 (1.1) 1.65 (1.0) 1.68 (1.0) 1.73 (1.0)
att 6-15 22% 26% 2.96 (1.3) 2.28 (1.0) 1.74 (0.8) 1.40 (0.6)
att 16+ 28% 3% 3.71 (1.4) 2.70 (1.0) 1.72 (0.6) 1.21 (0.4)
att Multi 76% 25% 2.25 (1.2) 1.88 (1.0) 1.70 (0.9) 1.61 (0.9)
chile 2-5 26% 48% 9.23 (1.5) 6.27 (1.0) 6.45 (1.0) 6.24 (1.0)
chile 6-15 22% 25% 23.45 (2.0) 11.73 (1.0) 12.08 (1.0) 9.13 (0.8)
chile 16+ 28% 18% 45.11 (2.9) 15.52 (1.0) 23.51 (1.5) 13.06 (0.8)
chile Multi 76% 30% 20.47 (2.1) 9.59 (1.0) 11.53 (1.2) 8.42 (0.9)
wpi 2-5 26% 46% 5.24 (1.1) 4.90 (1.0) 3.69 (0.8) 3.60 (0.7)
wpi 6-15 21% 25% 14.23 (1.7) 8.15 (1.0) 7.49 (0.9) 5.87 (0.7)
wpi 16+ 28% 18% 26.87 (2.1) 12.87 (1.0) 14.28 (1.1) 8.20 (0.6)
wpi Multi 75% 30% 12.24 (1.6) 7.46 (1.0) 6.97 (0.9) 5.18 (0.7)
Table 2: Servers Exhibiting Perfect Connection Persistence from Three Client Sites

Table 2 shows four lines of results for each client site. For each client, the first three lines are classifications based on the number of objects to be retrieved while the fourth line is a summary for all multi-object server tests exhibiting perfect connection persistence. The categorization is introduced to examine variations that occur due to the number of objects retrieved. The ranges of 2-5, 6-15 and 16+ are intended to reflect a small, medium and large number of objects for retrieval. The ``Range Pct.'' column reflects the percentage of servers with the given number of objects relative to the total count of servers (column 3 in Table 1). The ``Pct. Persistent for Range'' column indicates the percentage of servers exhibiting perfect connection persistence among all servers with the given range of objects. For example, 49% of servers with 2-5 objects showed perfect connection persistence when tested from the AT&T client. The fact that only 3% of the servers with 16+ objects exhibited perfect connection persistence from the AT&T client is out of line with the results for all other client sites in Tables 2 and 8. We do not have a clear explanation for this behavior, but do note that the percentage of servers with 16+ objects exhibiting imperfect connection persistence from the AT&T client in Table 4 is actually higher than for other client sites.

The last four columns show the average retrieval time in seconds for each of the four protocol options. The number in parentheses is the ratio of the given time to the time for the burst-1.0 option, showing the performance of each option relative to common HTTP/1.0 usage. Again illustrating with an example, the burst-1.1 option (pipelining with persistence) for 6-15 objects from the AT&T client took on average 1.40 seconds. This is approximately 60% (0.6) of the time taken to retrieve the same objects using burst-1.0.

Table 3 shows an alternate approach for presenting the relative performance of the four protocol options for perfect connection persistence servers. The table shows the relative variation in the results shown in Table 2. The retrieval times for each protocol option from a client site are compared against the time for the burst-1.0 option. If the absolute value of the difference is less than one second then the relative performance of these two options is considered the ``same''. If the protocol option is more than one second faster then it is classified as ``better'' than burst-1.0, and if it is more than one second slower then it is classified as ``worse''. Table 3 shows the percentages of servers that are classified as better, the same and worse for each protocol option from the three client sites in Table 2. The results are consistent with the ratios given in Table 2, except they reduce the significance of differences for relatively well-connected clients such as AT&T, where the retrieval times for all protocol options are relatively small.

Client Site | Object Count Range | Better/Same/Worse Pct. Performance Relative to Burst-1.0: Serial-1.0, Burst-1.0, Serial-1.1, Burst-1.1
att 2-5 15/75/9% 0/100/0% 15/76/8% 15/74/11%
att 6-15 10/42/48% 0/100/0% 16/84/0% 22/70/8%
att 16+ 29/29/43% 0/100/0% 43/29/29% 57/43/0%
att Multi 14/63/22% 0/100/0% 17/77/7% 19/71/10%
chile 2-5 12/18/70% 0/100/0% 15/58/28% 23/59/18%
chile 6-15 6/0/94% 0/100/0% 32/21/47% 62/21/17%
chile 16+ 0/0/100% 0/100/0% 7/0/93% 73/2/25%
chile Multi 8/10/82% 0/100/0% 17/37/47% 43/38/20%
wpi 2-5 23/46/31% 0/100/0% 31/50/19% 33/54/13%
wpi 6-15 15/7/78% 0/100/0% 28/41/30% 41/48/11%
wpi 16+ 12/2/86% 0/100/0% 30/21/49% 53/23/23%
wpi Multi 19/27/55% 0/100/0% 30/41/29% 40/46/15%
Table 3: Variation in Performance for Servers Exhibiting Perfect Connection Persistence from Three Client Sites
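The one-second rule used to classify options for Table 3 can be expressed as a small sketch; the dictionary of per-option times is an assumed representation of a single server test.

def compare_to_burst_10(times, threshold=1.0):
    """Classify each protocol option against burst-1.0 for one server test.

    times maps option name -> measured retrieval time in seconds.
    """
    baseline = times["burst-1.0"]
    labels = {}
    for option, t in times.items():
        if abs(t - baseline) < threshold:
            labels[option] = "same"      # within one second of burst-1.0
        elif t < baseline:
            labels[option] = "better"    # more than one second faster
        else:
            labels[option] = "worse"     # more than one second slower
    return labels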

Overall, the burst-1.1 option generally exhibits the best performance with the burst-1.0 and serial-1.1 options in the middle and the serial-1.0 option exhibiting the worst performance. These results are as expected and consistent with those presented in [17]. However, a number of other results come out of examination of the results in Tables 2, 3 and 8:

  1. The percentage of servers that exhibit perfect connection persistence goes down as the number of objects retrieved increases.

  2. The relative performance of burst-1.1 (compared to burst-1.0) improves as the number of objects increases. Thus pipelining and persistence improve relative performance with more objects, but more objects also cause more problems in correctly obtaining all objects with a single connection.

  3. The percentage of servers that support perfect connection persistence is relatively low. Looking at the ``Multi'' row for each client site (or the fifth column in Table 1), we see the range to be 25-31% of servers. Variations occur because tests were run at different times from different clients under different network and server conditions.

    To better understand this result we retested and analyzed results from the WPI client with the PROCOW test set. We found that all objects were successfully retrieved from multiple object servers in 99% and 98% of the cases for the serial-1.0 and burst-1.0 options, respectively. However, only in 29% of the cases were all objects retrieved from these servers with only one TCP connection using the burst-1.1 option; for the serial-1.1 option the corresponding figure was 40%. These results confirm that the failure of the burst-1.1 option to use only one connection is largely responsible for the small percentage of servers classified as perfect connection persistence servers.

    In looking for reasons that persistence was not present, we found that in 36% of the cases for the burst-1.1 option, the server either reported it was using HTTP/1.0 or explicitly included Connection: close in one of its response headers. In 23% of cases for this option, the server did not exhibit any persistence nor was there any reason given based on the server response. These cases indicate that the TCP connection was closed or reset without explicit warning. The two figures were 24% and 11% for the serial-1.1 option.

Table 4 shows results from three client sites for servers that exhibit imperfect connection persistence. Results for the additional client sites are shown in Table 9 at the end of the paper. Servers are classified as imperfect from a client site when the number of TCP connections needed is more than one, but fewer than the number of retrieved objects, for at least one of the 1.1 options. The results show that the relative performance of the serial-1.1 and burst-1.1 options is worse than for the servers exhibiting perfect connection persistence. These results indicate that the reconnection costs for dropped or lost connections impact overall performance to the point that imperfect persistence servers generally exhibit worse performance with the 1.1 options than with the burst-1.0 option.

Client Site | Object Count Range | Range Pct. | Pct. Persistent for Range | Ave. Retrieval Time (Sec.) (Ratio with Burst-1.0): Serial-1.0, Burst-1.0, Serial-1.1, Burst-1.1
att 2-5 26% 5% 7.57 (3.7) 2.05 (1.0) 2.75 (1.3) 4.89 (2.4)
att 6-15 22% 15% 4.78 (1.5) 3.15 (1.0) 3.03 (1.0) 2.61 (0.8)
att 16+ 28% 33% 8.87 (2.0) 4.51 (1.0) 4.36 (1.0) 2.48 (0.6)
att Multi 76% 18% 7.73 (2.0) 3.93 (1.0) 3.87 (1.0) 2.75 (0.7)
chile 2-5 26% 8% 17.82 (2.1) 8.69 (1.0) 10.92 (1.3) 12.25 (1.4)
chile 6-15 22% 15% 32.14 (2.0) 16.36 (1.0) 20.27 (1.2) 21.86 (1.3)
chile 16+ 28% 12% 60.51 (2.6) 23.57 (1.0) 33.33 (1.4) 34.02 (1.4)
chile Multi 76% 11% 39.39 (2.3) 17.22 (1.0) 22.94 (1.3) 24.13 (1.4)
wpi 2-5 26% 7% 5.77 (1.3) 4.28 (1.0) 7.32 (1.7) 6.44 (1.5)
wpi 6-15 21% 14% 19.14 (2.1) 9.04 (1.0) 11.69 (1.3) 12.64 (1.4)
wpi 16+ 28% 11% 33.11 (2.6) 12.63 (1.0) 18.01 (1.4) 18.98 (1.5)
wpi Multi 75% 10% 21.60 (2.3) 9.37 (1.0) 13.19 (1.4) 13.73 (1.5)
Table 4: Servers Exhibiting Imperfect Connection Persistence from Three Client Sites

4.3 Time of Day Analysis

The previous analysis used results from retrievals by a client to a large number of servers. The various protocol options were tested at approximately the same time for each client/server pair, but there was no control over when these tests were run. To have more control over when tests were run, we created the smaller, select set of servers and created a script to test each server at precise six-hour intervals for one week. This script was run from the AT&T, Chile and WPI client sites. A test for the first server in the list was started at 0:02, 6:02, 12:02 and 18:02 GMT each day. Tests for subsequent servers in the list were started at three-minute intervals, for an approximate testing period of three and one-half hours for a single round of testing. Results for each round from each of the three client sites are shown in Table 5, which is of similar format to Tables 2 and 4 but includes average object and byte counts. Note that the results shown are only for servers exhibiting perfect connection persistence during at least one of the seven days in each time period.

Client Site | Time Range (GMT) | Pct. Persistent | Ave. Obj. Cnt. | Ave. Obj. Bytes | Ave. Retrieval Time (Sec.) (Ratio with Burst-1.0): Serial-1.0, Burst-1.0, Serial-1.1, Burst-1.1
att 00:00-03:30 42% 7.4 32248 1.85 (1.4) 1.30 (1.0) 2.06 (1.6) 1.74 (1.3)
att 06:00-09:30 42% 7.7 33093 1.60 (1.4) 1.18 (1.0) 2.22 (1.9) 1.68 (1.4)
att 12:00-15:30 43% 7.5 32886 3.47 (1.6) 2.22 (1.0) 2.51 (1.1) 2.14 (1.0)
att 18:00-21:30 42% 7.3 31754 3.81 (1.6) 2.31 (1.0) 2.64 (1.1) 2.32 (1.0)
chile 00:00-03:30 32% 9.0 33004 30.80 (1.9) 16.59 (1.0) 17.13 (1.0) 12.98 (0.8)
chile 06:00-09:30 40% 9.1 35322 19.60 (2.0) 9.67 (1.0) 10.80 (1.1) 7.00 (0.7)
chile 12:00-15:30 38% 9.3 35225 25.17 (1.9) 13.11 (1.0) 14.09 (1.1) 9.46 (0.7)
chile 18:00-21:30 35% 9.1 34658 30.09 (1.9) 16.25 (1.0) 17.69 (1.1) 12.86 (0.8)
wpi 00:00-03:30 41% 8.7 33963 16.11 (1.8) 8.76 (1.0) 8.71 (1.0) 6.25 (0.7)
wpi 06:00-09:30 42% 9.0 34889 12.70 (1.9) 6.54 (1.0) 7.09 (1.1) 5.05 (0.8)
wpi 12:00-15:30 43% 9.4 36526 9.28 (1.8) 5.20 (1.0) 5.73 (1.1) 4.04 (0.8)
wpi 18:00-21:30 40% 9.6 37143 22.04 (1.9) 11.56 (1.0) 11.04 (1.0) 8.33 (0.7)
Table 5: Servers Exhibiting Perfect Connection Persistence Tested at Different Times of Day

The results show that average performance is generally best for the 6:00-9:30 GMT time period (1:00-4:30 on the east coast of the U.S.); the WPI results show that the 12:00-15:30 time period is best. Performance is generally the worst for the 18:00-21:30 time period (13:00-16:30 EST). These variations are expected, with an approximate ratio of two between the worst and best time periods for a protocol option. Of more interest to our study are the variations in relative performance of the four protocol options. The results show little variation in the relative performance of the options as network/server activity varies, other than results from the AT&T client, which shows relatively fast access so that small variations have a larger effect on the ratio.

4.4 Caching

We again used the select set of servers to analyze the end-to-end performance effects of caching, and restricted our analysis to those server tests exhibiting perfect connection persistence. The performance of each of the protocol options is shown in the first row for each client in Table 6. The relative performance of the four protocol options is much the same as found in the PROCOW test set for the given number of objects.

Client Site | Cache Use | Ave. Retrieval Time (Sec.) (Ratio with Burst-1.0): Serial-1.0, Burst-1.0, Serial-1.1, Burst-1.1 | Retrieved Objects | Retrieved Bytes
att no cache 3.05 (1.7) 1.83 (1.0) 2.39 (1.3) 1.91 (1.0) 7.6 32405
att with cache 0.51 (1.9) 0.27 (1.0) 0.27 (1.0) 0.28 (1.0) 0.5 6550
att validate cache 2.44 (1.8) 1.37 (1.0) 1.23 (0.9) 0.74 (0.5)
chile no cache 25.88 (1.9) 13.58 (1.0) 14.51 (1.1) 10.32 (0.8) 8.9 33810
chile with cache 2.91 (1.3) 2.24 (1.0) 2.11 (0.9) 1.63 (0.7) 0.5 6210
chile validate cache 19.14 (2.0) 9.41 (1.0) 9.92 (1.1) 5.13 (0.5)
wpi no cache 13.95 (1.8) 7.54 (1.0) 7.55 (1.0) 5.45 (0.7) 8.9 34873
wpi with cache 1.35 (1.1) 1.24 (1.0) 1.00 (0.8) 0.80 (0.6) 0.5 6704
wpi validate cache 10.57 (1.9) 5.65 (1.0) 4.97 (0.9) 2.57 (0.5)
Table 6: Caching Impact for Servers Exhibiting Perfect Connection Persistence

Of interest is the second row in the table for each client. This row predicts the performance results if a client cache were used. Because the select data set has a number of tests between the same client and server, we can determine when an object from that server has been previously retrieved. For this study, we assume the cached contents can be reused if the size of the object has not changed. While this assumption is not always valid, it is sufficient for the scope of this analysis. The number of objects and bytes retrieved in the presence of a cache is significantly reduced from the results in our test. The results include all of the initial cache misses. These results indicate that the set of Web pages in the select set were relatively static over the week of our study. The performance for each of the protocol options was not measured directly, but derived from the test data with an assumption of zero time for a cache hit. The results show that such a high cache reuse percentage leads to much better performance for all protocol options. As expected, caching has the most relative impact for the slowest serial-1.0 option.

The last row for each client in Table 6 shows derived costs if each cache hit also incurred a validation cost, where the client must send a GET If-Modified-Since or a GET If-None-Match request to the server and receive a 304 response before reusing the cached content. While this assumption is unrealistic, it lets us examine the impact of validation requests on end-to-end performance. Our measured results for an object retrieval differentiate between when the first byte of the response is received and when all bytes are received. For our derivation we used the time when the first byte is received as an approximation of the time for a header-only 304 response. The results show that the derived validation costs significantly increase the overall time, particularly for the serial-1.0 option. However, the performance of serial-1.1 improves relative to the other options.
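The derivations behind the ``with cache'' and ``validate cache'' rows can be sketched as follows. The record layout and helper name are illustrative assumptions; the hit rule (reuse when the object size is unchanged) and the timing rules (zero cost for a plain hit, first-byte time as a stand-in for a 304 exchange) follow the assumptions stated above.

def derived_cache_time(objects, cache, validate=False):
    """Derive retrieval time for one test under the caching assumptions above.

    objects is a list of (url, size, first_byte_time, full_time) records from
    a measured test; cache maps url -> last observed size.
    """
    total = 0.0
    for url, size, first_byte_time, full_time in objects:
        if cache.get(url) == size:
            # Cache hit: free, or charged the first-byte time when a
            # validation (conditional GET / 304) round trip is assumed.
            total += first_byte_time if validate else 0.0
        else:
            total += full_time   # cache miss: pay the full measured retrieval time
            cache[url] = size    # remember the object for subsequent rounds
    return total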

In summary, the impact of caching is to reduce the number of objects retrieved. We can use the results in Tables 2, 3, 4, 8 and 9 to see that as the number of objects is reduced there is less relative difference between the protocol options. As the tables show, this reduction is not uniform so caching will yield the most cost reduction for serial-1.0 and the least for burst-1.1. The results also show that validation costs can be significant, particularly when connection and request times are in the critical path. As one measure of the validity of our assumption for deriving validation costs, we also issued a ``GET If-None-Match: *'' request to servers from the AT&T client. The ratio between average If-None-Match and full retrieval times for an object was 0.6. Hence the derived costs for validation shown in Table 6 do not appear unreasonable, but further testing is warranted.

4.5 Multi-Server Content

In analyzing the end-to-end performance impact of embedded objects served by servers other than the base server, we use the PROCOW set of servers. The first part of our analysis is to determine the extent to which content is being served from multiple servers. These results are shown in Table 7, where the second column shows the number of cases where the base object includes embedded objects served by servers other than the base server. These numbers are relative to the count of base servers in column 2 of Table 1. For example, the HP site shows that 99 out of 665 (15%) base URLs use more than one server to serve content.

Client Site | Multi-Server Cnt. | Local Servers: Cnt., Obj. Pct., Byte Pct. | Ad Servers: Cnt., Obj. Pct., Byte Pct. | Akamai Servers: Cnt., Obj. Pct., Byte Pct. | Other Servers: Cnt., Obj. Pct., Byte Pct.
aciri 97 31 42% 25% 34 11% 3% 10 55% 33% 42 14% 9%
aust 95 32 49% 29% 34 12% 5% 7 54% 40% 38 12% 8%
chile 103 33 39% 20% 34 11% 4% 7 63% 40% 45 13% 7%
hp 99 31 45% 24% 38 11% 4% 11 59% 33% 40 12% 7%
uky 90 27 39% 19% 36 9% 4% 8 69% 43% 39 14% 10%
norway 96 30 44% 27% 36 11% 4% 11 60% 37% 37 16% 9%
safrica 92 28 55% 31% 37 11% 3% 9 53% 38% 39 13% 8%
wpi 92 28 42% 23% 33 12% 7% 8 63% 44% 37 21% 12%
Table 7: Use of Multi-Server Content

We label content servers separate from the base server as auxiliary servers. To further explore these results, we classified each of the auxiliary servers used. We did this classification through a combination of looking at the auxiliary server's name and IP address (results from the AT&T client are not shown because of problems due to masking when the IP address is printed). If the network portion of the auxiliary server's IP address matched the network portion of the base server's address, it was classified as a local server (i.e., local to the base server, not the client). We realize that such a classification may not always be correct, but we were able to verify most of the sample by hand. Non-local servers containing the string ``ad'' in the server name (such as ``adforce'') were classified as ad servers. Non-local servers containing the string ``akamai'' formed another category; Akamai [1] is a commercial company that serves content for contracted sites. All other non-local servers were grouped in the final category as Other. The purpose of these other sites is unknown--some may also serve ads or contracted content.
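A sketch of this classification follows, under the simplifying assumption that the ``network portion'' of an address is its first three octets (the study's actual prefix comparison was partly verified by hand).

def classify_auxiliary(aux_name, aux_ip, base_ip):
    """Classify an auxiliary (non-base) content server, mirroring the rules above."""
    if aux_ip.split(".")[:3] == base_ip.split(".")[:3]:
        return "local"       # same network as the base server
    if "ad" in aux_name:
        return "ad"          # e.g. hosts such as ``adforce''
    if "akamai" in aux_name:
        return "akamai"      # contracted content distribution
    return "other"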

Results in Table 7 show the count of non-local servers in each category as well as the percentage of objects and bytes served by this category relative to the total number of objects and bytes for the base Web page. The percentages only include Web pages where the given category of servers are present. All cases show that auxiliary servers serve relatively more objects than bytes.

We also examined the impact of these categories on end-to-end performance. Because the base server and its auxiliary servers serve different numbers of objects and bytes, it is not possible to compare the response times directly. Rather, we determined the rate at which objects and bytes are served from each category of server, using the best-case time among the supported protocol options for each server. We found that local servers in the same network as the base servers have a higher object rate but a lower byte rate. Ad servers are mixed on object rate and lower on byte rate. Some of the servers that we categorized as other could be ad servers as well; their byte rate is often less than that of the base servers.
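The rate computation can be sketched as follows; the argument names are illustrative.

def serving_rates(objects_served, bytes_served, option_times):
    """Object and byte serving rates for one server in one test.

    option_times holds the measured retrieval time for each protocol option
    the server supported; the best (smallest) time is used, as described above.
    """
    best = min(option_times.values())
    return objects_served / best, bytes_served / best

The relative rates quoted below for auxiliary servers are these per-server rates divided by the corresponding rates of the base server.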

The results also show that content distribution servers (which happen to be only Akamai servers in our test set) almost always show improved data rates relative to the base server. The relative object rate for the Akamai servers ranged from 2.0 for the Chile client to 20.0 for the HP client. The relative byte rate for the Akamai servers ranged from 0.7 for the Chile client to 7.5 for the Australia client. To further investigate these results we exploited the mechanism used to name Akamai-served objects where the object name includes the base server URL for these objects. This naming scheme allowed us to design a test where we retrieved from each client site the same set of objects from an Akamai server and their original base server. In this test the Akamai servers always yielded relatively better data rates. The relative object rate ranged from 1.9 for Chile to 15.2 for AT&T. The relative byte rate ranged from 2.1 for South Africa to 15.6 for AT&T.

Translating the impact of these improved data rates to end-to-end response time is more difficult, and depends on the strategy used by a server in distributing objects and by a client in retrieving these objects from multiple servers. If a client retrieves objects from one server at a time, then improvements in data rates for the remote content servers might be mitigated by increased costs to establish new connections with these servers, particularly if the client already has a pipelined, persistent connection with the base server. If, on the other hand, the client retrieves objects in parallel from all servers, then total response time will be controlled by the server spending the most time serving content. Results in Table 7 show that the base server is still serving most of the bytes, if not the objects, when multiple servers are used. When most of the content comes from the base server, offloaded content does reduce response time, but the distinction between distributing content to a faster server versus simply a different server is then less clear.

In summary, the performance impact of multi-server content depends on how multiple servers are used (for example, performance is not a key consideration for ad servers) and on how multi-server content is retrieved by clients. More study is needed on this issue, but it is important to consider that the use of multi-server content was still relatively small at the time of this study, being used by 15% of base URLs, with less than 1.5% using it for remote content distribution.

4.6 Range Requests

We examined a subset of requests sent to HTTP/1.1 servers that were able to handle Range requests to see if there was an appreciable reduction in latency in getting just the first hundred bytes or the first thousand bytes of an object as compared to a full retrieval. For the first-hundred-bytes test, among the servers that responded correctly with the 206 Partial Content response, the average latency for a Range response was 60% that of a full response for the AT&T Labs data, though the average full response time was only half a second. The figures for South Africa are 53% with an average full response time of 3.84 seconds, and for Norway 65% with an average full response time of 0.72 seconds. For the thousand-byte range test the numbers are similar (AT&T 59% with an average full response time of 0.71 seconds, South Africa 57% with a 3.84 second average, Norway 59% with a 1 second average). So it is clear that there is improvement in user-perceived latency, but whether the improvement is appreciable depends on the speed of the link from a client site and its location in the Internet relative to server sites.
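For illustration, a minimal probe of this kind could be written with Python's standard http.client as below. This is a sketch of the kind of measurement described above, not the httperf-based harness used in the study, and the host and path arguments are placeholders.

import http.client
import time

def timed_range_request(host, path, last_byte=99):
    """Time a Range request for the first last_byte+1 bytes of an object."""
    conn = http.client.HTTPConnection(host)
    start = time.time()
    conn.request("GET", path, headers={"Range": "bytes=0-%d" % last_byte})
    response = conn.getresponse()
    body = response.read()
    elapsed = time.time() - start
    conn.close()
    # A server that handles the request correctly returns 206 Partial Content.
    return response.status == 206, elapsed, len(body)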

5 Discussion

The results from this study lead to a number of interesting observations about the factors that influence end-to-end Web performance. Focusing on the protocol option used, the results show that the best end-to-end performance is obtained when servers support persistent connections with pipelining (burst-1.1 option) and the connection persists over the lifetime of object retrieval from the server. The improved performance for this protocol option grows relative to other protocol options as more objects need to be retrieved from a server. The amount of this improvement is relatively constant over different times-of-day with different network and server conditions.

However there are issues with this expected result that dampen its effect. First, the likelihood that a server is able to support pipelining over a single persistent connection for the lifetime of object retrieval decreases as the number of objects increases. Second, the performance benefit of persistent connections is generally lost relative to the parallel HTTP/1.0 (burst-1.0) option if the connection is reset by the server and a new one (or ones) must be reestablished by the client. These results are highlighted by differences between perfect connection persistence results in Tables 2, 3 and 8 and imperfect connection persistence results in Tables 4 and 9. Finally, the potential performance benefits of the burst-1.1 policy are only available for about 50% of the servers we tested, as approximately 25% of servers served only one object and another 25% served a small number of objects. End-to-end performance for these servers will differ little based on what protocol option is used.

Our results also show that the interactions between various factors are important in end-to-end performance. The relative impact of caching for a client will vary according to the protocol option being employed by the client. Caching without the need for validation of the contents with the server can significantly improve all options, particularly the serial-1.0 option. However, if validation is needed then the cost of each TCP connection is significant and the burst-1.1 option performs even better than the other options.

Interactions are also important in measuring the impact of multi-server content on end-to-end performance. If a client already has a persistent, pipelined connection with a base server, then retrieving a small amount of content from a different server, even if access to that server is faster than the base, may actually increase the total response time to retrieve all objects. However, if a significant amount of content does not need to be retrieved from the base server, but can be retrieved from a server closer to the client then the client should gain in performance if comparable protocol options are available from the servers.

In summary, the study raises a number of issues for further investigation, but the results do point to some recommendations that we can make for clients and servers to improve end-to-end performance based upon the factors we studied.

6 Related Work

There have been many studies examining Web performance from various perspectives. Nielsen et al. [17] published the first work on measured performance of the HTTP/1.1 protocol. A related piece of work studied the performance interactions of persistent HTTP with TCP. Both of these studies found significant interaction problems that needed correction before persistent HTTP performed as expected. These works build on prior work examining the impact of persistent connection HTTP [14].

More recent work has examined how bottlenecks in the network, CPU and disk system affect the relative performance of HTTP/1.0 and HTTP/1.1 [2]. This work describes a controlled experiment to understand the relative impact of these subsystems on the protocol versions. One of the results of this work is a recommendation that HTTP/1.1 clients implement an ``early close'' policy, where the client closes a connection after retrieving all objects associated with a Web page.

A number of other studies and tools directly examine the user-perceived performance of the Web. Keynote makes available a tool to visualize the performance of a Web retrieval [8]. Kruse, et al examine the impact of interleaving requests on user-perceived performance [12]. Gilbert and Brodersen demonstrate the benefits of progressive delivery of Web images in Web page rendering [6].

7 Summary and Future Work

In this work, we have examined factors contributing to end-to-end delay for a client's Web experience by performing a large scale study of popular Web sites. We have built on the PROCOW infrastructure by adding additional clients to the collection of client sites around the world. We have presented results on performance improvements due to the changes in the HTTP/1.1 protocol and for the impact of caching and multi-server content distribution in conjunction with different protocol options.

Our results show that the HTTP/1.1 protocol, particularly with pipelining, is indeed an improvement over existing practice, but that servers serving a small number of objects or closing a persistent connection without explicit notification can reduce or eliminate any performance improvement. Similarly, use of caching and multi-server content distribution can also improve performance if done effectively.

We believe that our work is a step in the right direction of measuring end-to-end performance. Our global testing infrastructure is solidifying and our process is largely automated. Clearly, further work is warranted. We expect more content diversification on the Web, and the dependence on DNS for such methods needs to be examined more closely. We need to investigate the contribution of network-level aspects to the latency and see how they interact with the HTTP layer. This includes passive measurements to examine the variance against the observed measurements at the application layer. We also plan to test other types of target servers, such as those responsible for serving objects that account for a larger portion of bandwidth usage and those serving dynamically generated objects.

Acknowledgements

The authors would like to thank several people who were helpful in getting us access to machines in many parts of the world--the study would not have been possible without their help. They include Martin Arlitt, Alan Barrett, Steven Bellovin, Randy Bush, Jim Griffioen, Eduardo Krell, Anders Lund, Mark Murray, Scott Shenker and Graeme Yates. We thank Mikhail Mikhailov for assistance in setting up the testing framework and David Finkel for consultation on the analysis. We thank Bruce Maggs for answering questions related to Akamai.

Client Site | Object Count Range | Range Pct. | Pct. Persistent for Range | Ave. Retrieval Time (Sec.) (Ratio with Burst-1.0): Serial-1.0, Burst-1.0, Serial-1.1, Burst-1.1
aciri 2-5 27% 52% 2.18 (1.0) 2.17 (1.0) 1.79 (0.8) 1.90 (0.9)
aciri 6-15 21% 29% 5.40 (2.7) 2.00 (1.0) 2.15 (1.1) 1.26 (0.6)
aciri 16+ 28% 20% 9.64 (2.7) 3.59 (1.0) 5.21 (1.5) 3.05 (0.9)
aciri Multi 76% 34% 4.62 (1.9) 2.45 (1.0) 2.64 (1.1) 2.01 (0.8)
aust 2-5 26% 48% 10.61 (1.4) 7.42 (1.0) 7.10 (1.0) 6.16 (0.8)
aust 6-15 21% 27% 27.28 (2.4) 11.40 (1.0) 13.24 (1.2) 10.93 (1.0)
aust 16+ 28% 18% 59.80 (2.8) 21.06 (1.0) 28.11 (1.3) 19.20 (0.9)
aust Multi 75% 31% 25.11 (2.2) 11.29 (1.0) 13.06 (1.2) 10.09 (0.9)
hp 2-5 26% 50% 1.61 (1.3) 1.20 (1.0) 1.69 (1.4) 1.37 (1.1)
hp 6-15 21% 27% 5.18 (2.2) 2.32 (1.0) 2.55 (1.1) 1.52 (0.7)
hp 16+ 28% 17% 10.22 (2.5) 4.07 (1.0) 3.80 (0.9) 2.45 (0.6)
hp Multi 75% 31% 4.24 (2.1) 2.06 (1.0) 2.33 (1.1) 1.63 (0.8)
uky 2-5 25% 49% 6.64 (1.3) 5.07 (1.0) 4.27 (0.8) 3.59 (0.7)
uky 6-15 20% 28% 14.00 (2.4) 5.80 (1.0) 5.62 (1.0) 3.88 (0.7)
uky 16+ 27% 18% 38.76 (2.8) 13.69 (1.0) 15.69 (1.1) 7.01 (0.5)
uky Multi 72% 32% 15.53 (2.2) 7.15 (1.0) 7.11 (1.0) 4.41 (0.6)
norway 2-5 26% 48% 3.50 (1.6) 2.19 (1.0) 2.65 (1.2) 2.23 (1.0)
norway 6-15 22% 9% 8.00 (2.1) 3.79 (1.0) 3.97 (1.0) 2.63 (0.7)
norway 16+ 27% 2% 19.18 (2.9) 6.57 (1.0) 11.24 (1.7) 8.09 (1.2)
norway Multi 75% 20% 4.55 (1.8) 2.53 (1.0) 3.09 (1.2) 2.46 (1.0)
safrica 2-5 26% 49% 14.22 (1.4) 10.45 (1.0) 10.35 (1.0) 10.71 (1.0)
safrica 6-15 21% 26% 35.43 (2.1) 16.95 (1.0) 22.36 (1.3) 17.20 (1.0)
safrica 16+ 27% 16% 83.90 (2.8) 29.79 (1.0) 47.13 (1.6) 30.28 (1.0)
safrica Multi 73% 31% 32.90 (2.1) 15.78 (1.0) 20.40 (1.3) 16.08 (1.0)
Table 8: Servers Exhibiting Perfect Connection Persistence from Additional Client Sites

Client Site | Object Count Range | Range Pct. | Pct. Persistent for Range | Ave. Retrieval Time (Sec.) (Ratio with Burst-1.0): Serial-1.0, Burst-1.0, Serial-1.1, Burst-1.1
aciri 2-5 27% 5% 2.77 (0.9) 3.05 (1.0) 3.80 (1.2) 3.98 (1.3)
aciri 6-15 21% 16% 6.48 (2.0) 3.32 (1.0) 3.58 (1.1) 4.59 (1.4)
aciri 16+ 28% 13% 7.16 (2.7) 2.69 (1.0) 3.83 (1.4) 3.46 (1.3)
aciri Multi 76% 11% 6.16 (2.0) 3.01 (1.0) 3.72 (1.2) 4.01 (1.3)
aust 2-5 26% 8% 11.10 (1.5) 7.58 (1.0) 8.55 (1.1) 10.00 (1.3)
aust 6-15 21% 10% 40.56 (2.7) 15.00 (1.0) 23.51 (1.6) 20.52 (1.4)
aust 16+ 28% 8% 61.44 (3.1) 20.03 (1.0) 33.77 (1.7) 21.66 (1.1)
aust Multi 75% 9% 37.27 (2.6) 14.10 (1.0) 21.73 (1.5) 17.32 (1.2)
hp 2-5 26% 6% 1.75 (0.9) 2.05 (1.0) 3.29 (1.6) 3.19 (1.6)
hp 6-15 21% 13% 7.31 (2.1) 3.55 (1.0) 3.79 (1.1) 4.81 (1.4)
hp 16+ 28% 15% 9.94 (2.4) 4.16 (1.0) 4.32 (1.0) 5.32 (1.3)
hp Multi 75% 11% 7.62 (2.1) 3.58 (1.0) 3.96 (1.1) 4.77 (1.3)
uky 2-5 25% 7% 5.80 (1.9) 3.12 (1.0) 3.21 (1.0) 7.09 (2.3)
uky 6-15 20% 17% 19.65 (1.7) 11.86 (1.0) 12.63 (1.1) 10.68 (0.9)
uky 16+ 27% 17% 38.74 (2.5) 15.65 (1.0) 15.04 (1.0) 12.69 (0.8)
uky Multi 72% 14% 26.27 (2.2) 12.10 (1.0) 12.10 (1.0) 11.00 (0.9)
norway 2-5 26% 5% 4.87 (2.3) 2.10 (1.0) 4.54 (2.2) 11.06 (5.3)
norway 6-15 22% 9% 13.86 (2.2) 6.39 (1.0) 6.89 (1.1) 18.21 (2.8)
norway 16+ 27% 4% 14.89 (2.1) 7.26 (1.0) 9.09 (1.3) 16.80 (2.3)
norway Multi 75% 6% 11.53 (2.1) 5.38 (1.0) 6.79 (1.3) 15.77 (2.9)
safrica 2-5 26% 6% 29.21 (1.4) 20.57 (1.0) 28.13 (1.4) 25.19 (1.2)
safrica 6-15 21% 17% 50.55 (1.9) 26.31 (1.0) 38.97 (1.5) 29.21 (1.1)
safrica 16+ 27% 21% 84.44 (2.4) 34.57 (1.0) 58.05 (1.7) 46.57 (1.3)
safrica Multi 73% 14% 65.22 (2.2) 29.81 (1.0) 47.39 (1.6) 37.70 (1.3)
Table 9: Servers Exhibiting Imperfect Connection Persistence from Additional Client Sites

References

1
Akamai.
http://www.akamai.com.

2
Paul Barford and Mark Crovella. A performance evaluation of hyper text transfer protocols. In Proceedings of the ACM SIGMETRICS '99 Conference, Atlanta, Georgia, May 1999. ACM.

3
Pei Cao and Sandy Irani. Cost-aware WWW proxy caching algorithms. In Symposium on Internet Technology and Systems. USENIX Association, December 1997.
http://www.usenix.org/publications/library/proceedings/usits97/cao.html.

4
R. Fielding, J. Gettys, J. C. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee. Hypertext Transfer Protocol - HTTP/1.1. RFC 2616, HTTP Working Group, June 1999.
ftp://ftp.ietf.org/rfc2616.txt.

5
1999 Fortune 500 companies, Fortune volume 139 number 8, April 26 1999.

6
Jeffrey Gilbert and Robert Brodersen. Globally progressive interactive web delivery. In Proceedings of the IEEE Infocom '99 Conference, New York, NY, March 1999. IEEE.

7
1998 Global 500 companies, Fortune Magazine 1998.

8
Keynote lifeline.
http://lifeline.keynote.com/Lifeline/buyitonline/snapshot.asp.

9
Balachander Krishnamurthy and Martin Arlitt. PRO-COW: Protocol compliance on the web. Technical Report 990803-05-TM, AT&T Labs, August 1999.
http://www.research.att.com/~bala/papers/procow-1.ps.gz.

10
Balachander Krishnamurthy, Jeffrey C. Mogul, and David M. Kristol. Key differences between HTTP/1.0 and HTTP/1.1. In Eighth International World Wide Web Conference, Toronto, Canada, May 1999.
http://www.research.att.com/~bala/papers/h0vh1.ps.gz.

11
Balachander Krishnamurthy and Craig E. Wills. Piggyback server invalidation for proxy cache coherency. In Seventh International World Wide Web Conference, pages 185-193, Brisbane, Australia, April 1998. Published in Computer Networks and ISDN Systems (30)1-7 (1998) pp. 185-193.
http://www.cs.wpi.edu/~cew/papers/www7/www7.html.

12
Hans Kruse, Mark Allman, and Paul Mallasch. Network and user-perceived performance of web page retrievals. In Proceedings of the First International Conference on Telecommunications and Electronic Commerce, Nashville, TN USA, November 1998.
http://roland.lerc.nasa.gov/~mallman/papers/ecom98.ps.

13
Media metrix. http://www.mediametrix.com.

14
Jeffrey C. Mogul. The case for persistent-connection HTTP. In Proceedings of the ACM SIGCOMM '95 Conference. ACM, August 1995.
http://www.acm.org/sigcomm/sigcomm95/papers/mogul.html.

15
David Mosberger and Tai Jin. httperf - a tool for measuring web server performance. In Workshop on Internet Server Performance, Madison, Wisconsin USA, June 1998.
http://www.cs.wisc.edu/~cao/WISP98/final-versions/davidm.ps.

16
The netcraft web server survey. http://netcraft.co.uk/survey/.

17
Henrik Frystyk Nielsen, Jim Gettys, Anselm Baird-Smith, Eric Prud'hommeaux, Håkon Lie, and Chris Lilley. Network performance effects of HTTP/1.1, CSS1, and PNG. In Proceedings of the ACM SIGCOMM '97 Conference. ACM, September 1997.
http://www.acm.org/sigcomm/sigcomm97/papers/p102.html.

18
100hot.com. http://www.100hot.com.

19
Craig E. Wills and Mikhail Mikhailov. Towards a better understanding of web resources and server responses for improved caching. In Eighth International World Wide Web Conference, Toronto, Canada, May 1999.
http://www.cs.wpi.edu/~cew/papers/www8.ps.gz.

20
Craig E. Wills and Mikhail Mikhailov. Studying the impact of more complete server information on web caching. Technical Report WPI-CS-TR-99-36, Computer Science Department, Worcester Polytechnic Institute, November 1999.
http://www.cs.wpi.edu/~cew/papers/tr99-36.ps.gz.

Vitae

Balachander Krishnamurthy is a Member of Technical Staff at AT&T Labs--Research in Florham Park, New Jersey, USA.

Craig E. Wills is an associate professor in the Computer Science Department at Worcester Polytechnic Institute. His research interests include distributed computing, operating systems, networking and user interfaces.